QUANTUM MECHANICS 
OF 
PARTICLES AND WAVE FIELDS


ARTHUR MARCH 


DOVER PUBLICATIONS, INC. 
Mineola, New York 


Bibliographical Note 


This Dover edition, first published in 2006, is an unabridged republication of 
the work originally published by John Wiley & Sons, Inc., New York, 1951. 


International Standard Book Number: 0-486-44578-X 


Manufactured in the United States of America 
Dover Publications, Inc., 31 East 2nd Street, Mineola, N.Y. 11501 


Preface 


Since its inception nearly a quarter of a century ago, quantum 
mechanics has been the subject of many books. Notwithstanding 
this abundance of books, if someone, not a theoretical physicist and 
not thoroughly acquainted with modern methods of analysis, were to 
attempt to digest a current article dealing with nuclear forces or cos- 
mic rays, relying for help on the available books on the subject, he 
would discover to his dismay that modern quantum mechanics differs 
radically from that which he finds in the textbooks. Even the 
terminology is different. What, for example, is the pseudoscalar 
field which in some obscure way seems to be intimately associated with 
a particle? He will not find the answer in the information at his 
disposal. More confusing to him is the interpretation of a field in the 
new mechanics. It seems evident that what is meant is a real field 
possessing energy and momentum. Yet the textbooks attach a 
purely symbolic meaning to the wave field of a particle, picturing the 
field concept merely as a probability function. 

The whole difficulty arises from the fact that in recent years quan- 
tum mechanics, following relativistic laws, has reached certain con- 
clusions which are not in accord with its original ideas. It is true 
that the idea of representing the state of a system by a vector and any 
observable quantity by a linear operator is still the same, there being a 
special interpretation of both. But in relativistic quantum mechanics 
the vector representing a certain state is no longer determined with 
the aid of a wave equation. It was this belief in non-relativistic 
theory that led to the common belief—one still upheld by the text- 
books—that the waves encountered in wave mechanics have a purely 
constructive character. In relativistic theory this is not true. Only 
by the association of the particle with a real field, that is, one pos- 
sessing energy and momentum, was it possible to establish equations 
satisfying the requirements of relativity theory. It is assumed that 
this field is related to the particle in a way similar to the association of 
the photon with the electromagnetic field; that is, it is of importance 
only for those cases wherein the particle displays the properties of a 
wave motion. It is essential to realize that, as long as the field is not 
quantized, it is completely dissociated from the concept of a particle; 
it neither consists of particles nor admits of any quantity which might
have a corpuscular interpretation. In order to achieve this distinctly
one-sided description of the phenomena it is necessary to quantize 
the field, this being accomplished by transcribing the field quantities 
into matrices. This process brings about the appearance of particles 
and leads to that formalism which, making use of the property of non- 
commutability of certain matrices, unites the undulatory and the 
corpuscular nature into one consistent scheme. 

It is impossible to understand the methods of modern quantum 
mechanics without a knowledge of the way in which the theory has 
been developing. It is precisely for this reason that I have adopted as 
the main purpose of this book the presentation of the theory in such 
a manner that the reader will have adequate information whenever it 
is needed. To achieve this purpose, I felt that it was necessary to 
devote almost one half of the book to relativistic quantum mechanics, 
that is, the quantum mechanics of wave fields, the knowledge of which is 
indispensable for any physicist engaged in modern research work. As 
this part of quantum mechanics is considered especially difficult, I have 
taken the utmost care in explaining the fundamental ideas as clearly 
and understandably as possible. The reader should not only become 
acquainted with the mathematical formalism of the theory but also 
should first of all acquire a real understanding of its foundation. 
Criticism may be made that many important applications of the 
theory and methods of calculation are omitted, but the book was so 
planned as not to exceed a certain size and it seemed preferable to 
use the space for a thorough exposition of principles rather than for 
examples and methods which can be found in many existing textbooks. 

As the last chapter deals with the concept of a fundamental length, I 
feel that a word of justification for its introduction is needed. The 
method therein suggested for contending with the well-known funda- 
mental difficulties arising in quantum mechanics does not form a part 
of the currently adopted theory. Neither is there general agreement 
among physicists as to whether the concept of a constant is really 
indispensable for reaching a reasonable theory, nor is there agreement 
among those who support the idea as to how this new constant should 
be introduced into the theory. Thus it might seem that the intro- 
duction into the text of the concept of a constant which would limit 
the possibilities of observation is premature. And yet there is not the 
slightest doubt that in its present form quantum mechanics is of little 
or no use in the evaluation of nuclear processes unless the formalism 
is complemented with instruction as to how the divergent results of 
the theory can be made convergent. The universal length seems to 
be the simplest means that can be used to attain this end. Since I first
introduced the idea of the universal length, an idea developed in
several papers as early as 1936, I feel justified in presenting my views in 
this book. The procedure suggested in Chapter 10 has the advantages 
of being simple and permitting a plausible interpretation. More- 
over, in all those cases which are subject to calculation, it is in good 
agreement with the experimental facts. It is quite true that the 
method removes only those divergences which arise from the quantum- 
mechanical formalism; it does not affect those which are caused by 
the assumed punctiformity of the particles, a condition which already 
existed in classical theory. However, it certainly represents a step for- 
ward if we can handle successfully the difficulties first mentioned, diffi- 
culties by which theory is most hampered in its practical applications. 
Writing a book in a language which is not one’s mother tongue is 
always a venture. That it could succeed is due to Professor William 
Hurley of Fordham University, New York, who was kind enough to 
revise the wording and the formulas of the manuscript most carefully. 
I take this opportunity of expressing my deep gratitude to Professor 
Hurley. I am also indebted to my colleague, Mr. J. McDonaugh of 
the University of Innsbruck, who assisted me with valuable advice. 


A.M. 
Innsbruck, Austria 
March, 1951 


Contents 


1. Wave Mechanics of a Single Particle


1. The fundamental idea of quantum mechanics. 2. Heisenberg’s 
uncertainty relations. Coordinate and momentum. 3. Heisenberg’s 
uncertainty relations. Method of Doppler effect. 4. The new me- 
chanics and the principle of causality. 5. de Broglie waves. 6. The 
method of wave packets. 7. Reconciliation of wave and classical 
mechanics. 8. The wave mechanics of a particle moving in a field of 
force. 9. The geometrical method of wave mechanics. 10. The 
scattering of probability waves by a nucleus. 


2. Wave Mechanics of Stationary States

11. Schroedinger’s wave equation. 12. The experimental possibilities. 
13. States of undefined energy. 14. Wave mechanics and Bohr’s 
theory. 15. Expectation values of mechanical entities. 16. The 
principle of transformation. 17. The linear oscillator. 18. The 
hydrogen atom. 19. Discussion of the solution. Comparison with 
Bohr’s theory. 20. Wave mechanics and the correspondence principle. 
Transition probabilities. 


3. Wave Mechanics in Matrix Form
21. The idea of matrix mechanics. 22. The Hilbert space. Concept 
of matrix. 23. Addition and multiplication of matrices. 24. Dual, 
unitary, Hermitean matrices. 25. Transformation to principal axes. 
26. Functions of matrices. 27. The quantum-mechanical interpreta- 
tion of matrices. 28. The commutation relations. 29. Hermitean 
forms and expectation values. 30. Coordinate systems with a continu- 
ous infinity of axes. The passage from matrix to wave mechanics. 31. 
The fundamental problem of matrix mechanics. 32. Unique nature of 
the solution. 33. The dynamical law of quantum mechanics: The 
principle of causality. 34. Systems with many degrees of freedom. 


4. Perturbation Theory


35. Perturbation of non-degenerated systems. 36. Perturbation of 
degenerated systems. 37. Perturbation as causing transitions. 


5. Systems of Many Particles
38. Schroedinger’s equation for the many-body problem. 39. Sym- 
metric and antisymmetric solutions. 40. The exclusion principle 
relative to a combination of symmetric and antisymmetric states. 41. 
The helium atom. 42. Systems of many similar particles. Method 
of particle picture. 43. Systems of many similar particles. Method 
of wave picture. 44. Statistics of Bose-Einstein and Fermi-Dirac. 



6. Relativistic Wave Equations
45. Particles with spin 1/2. Dirac’s equation. 46. A particle in an
electromagnetic field. Dirac’s hole theory. 47. Particles with spin 
0. The equation of Klein and Gordon. 48. Digression on tensor cal- 
culus. Pseudoscalar wave field. 49. Particles with spin 1. de 
Broglie and Proca’s equation. 50. The pseudovector field. 


7. Quantization of Wave Fields
51. The idea of quantization. 52. Quantization of a scalar field. 53. 
Quantization of a vector field. 


8. Quantum Electrodynamics

54. Classical theory. The field as a superposition of plane waves. 55. 
Transformation of the Hamiltonian. 56. Quantization of the field. 
57. Quantization of a system consisting of field and particles. 58. 
Interaction between radiation and matter. 59. Emission and absorption
of a light quantum by an atom. 60. The divergences occurring in
the higher approximations. 


9. Wave Fields and Nuclear Matter
61. The Lagrangian of the interaction. 62. Scalar and pseudoscalar 
fields. 63. Vector and pseudovector fields. 64. The potential of 
the nuclear forces. 65. Nuclear scattering of mesons. 66. Magnetic 
moment of proton and neutron. 67. Mesons in an electromagnetic 
field. 


10. Introduction of a Fundamental Length
68. The idea of a fundamental length. 69. Introduction of $l_0$ into the
interaction terms. 70. Application to electrodynamics. 71. Brems- 
strahlung—transversal self-energy of an electron. 72. Application to 
the nuclear forces. 73. Nuclear scattering of mesons. Magnetic 
moment of proton and neutron. 74. Decay of negative mesons in
light elements. 


Index




1 
WAVE MECHANICS 
OF A SINGLE PARTICLE 


1. The Fundamental Idea of Quantum Mechanics. Quantum 
mechanics was developed because of the failure of classical physics to 
account for atomic phenomena. According to classical physics, the 
state of a physical system can be determined to any desired degree of 
accuracy by measuring certain quantities, such as coordinates and 
velocities, with ideal apparatus and applying to these measurements 
certain principles by means of which the future states of the system 
can be predicted. However, when a system of exceedingly small mass 
is considered, the following difficulty is encountered: in the meas- 
uring of any observable quantity, the state of the object is disturbed 
by the measuring process in an unpredictable way and the state loses 
its determinacy relative to other quantities. If an attempt is made to 
compensate for this loss by measuring one of the other observable 
quantities, the knowledge of the first is lost. Thus we can never 
succeed in determining the simultaneous values of all the quantities 
that define the state of the system. 

Such a situation, which finds its expression in what are called the 
uncertainty relations, has frequently been interpreted as being due to 
an interaction between the observing subject and the object observed, 
which prevents a distinct separation of observer and object, thus 
depriving the object of its determinacy. Such an interpretation is 
inaccurate in that it is not the observer that interacts with the object 
but rather the measuring apparatus itself. This interaction implies 
that the state of the object cannot be measured without being changed 
by the process of that measurement, a fact which of itself is not 
new. Prior to quantum mechanics it was known that the value of an 
observable is influenced by the introduction of an instrument, for 
example a thermometer, into the system. This complication involved 
no difficulty, since the disturbance caused by the instrument was 
considered an effect that could be compensated for with the aid of 
classical theory. The viewpoint of quantum mechanics is different.
It is based on the fundamental idea that any phenomenon that occurs
in nature consists of elementary processes which, by virtue of a natural 
law, cannot be analyzed. The emission or absorption of light and the 
scattering of a photon by an electron are examples of elementary 
processes which resist any attempt to analyze them. We do not know, 
and if quantum mechanics is correct we shall never know, what happens 
in an atom during the process that leads to the production or annihila- 
tion of a photon. As a result, we cannot apply the principle of 
causality to the process, and it appears to us as a discontinuity in the 
course of events. Hence there is an atomicity in nature, not only for 
matter but for events as well, and Planck’s constant h has the sig- 
nificance that it determines the size of an atomic event, if by size we 
mean the quantity of action involved in an elementary process. 

The above leads us to conclude that the uncertainty relations arise 
from the fact that, since any observation consists of an indeterminate 
interaction between object and measuring apparatus, the object is 
left in a state which escapes our control to an extent given by the con- 
stant h. Therefore it is impossible for us to determine the exact 
state of a system. As an example, consider a particle the $x$ coordinate
of which is to be measured. As will be seen, we are confronted with the
following situation: if a measurement made at time $t$ gives the value of
$x$ with an uncertainty $\Delta x$, meaning that $x$ lies somewhere between the
limits $x$ and $x + \Delta x$, the simultaneous value of the momentum $p_x$ of
the particle cannot be determined more accurately than with a possible
deviation $\Delta p_x$, which is related to $\Delta x$ by the expression
$\Delta p_x = h/\Delta x$. Thus, the more accurately we measure $x$, the less accurate
becomes the measurement of $p_x$, and vice versa. This follows from
the fact that with every measurement is associated an effect on the 
system which cannot be predicted, because it lies outside the domain of 
causality due to Planck’s constant h. Thus, whereas classical physics 
has assumed that all observables of a system can be measured simul- 
taneously to any desired degree of accuracy, there is in fact a limit to 
this accuracy. On principle, it is impossible to ascertain the exact 
initial state of a system, and consequently it must also be impossible 
to infer unique values of the observables at time $t$ from measurements
made at $t_0$. An exact prediction of the future is impossible without
exact knowledge of the present. Therefore quantum mechanics 
refuses to ascribe a physical meaning to equations that refer to an 
exactly measured initial state, there being no experiment to which such 
equations can be applied. The question, adequate to the possibilities 
of observation, is not what will develop from an exactly known state, 
but what will occur if the initial state is not exactly defined. Evidently
there is no unique answer to this question, for the state, being
indeterminate, offers a variety of possibilities. Only a prediction
based on probability can be made. To work out this prediction we
must start always at the same indeterminate initial state and then
measure the state after a time $t$. The result will be a statistical one
from which we can determine the probability that a measurement on
the system at time $t$ will furnish a given value.

Thus only probability relations exist between present and future, a 
fact that implies the essentially statistical nature of the new mechanics. 
The problem then is to determine how to formulate these relations in a 
general set of equations that will supply us with a description of atomic 
phenomena. As will be seen, this problem is solved satisfactorily by 
quantum mechanics. 

2. Heisenberg’s Uncertainty Relations. Coordinate and Momentum.
As a beginning, it will be proved that the state of a system
can be determined only within certain limits of accuracy even with 
ideal apparatus. For simplicity we shall consider first a single particle, 
for example an electron. Let the problem be to determine the position
and momentum of the particle at a given instant of time $t$. Let
us assume that from a previous measurement we know the velocity
with which the particle has been moving up to the instant $t$. We
assume also that the velocity is directed along the $x$ axis and denote
the corresponding momentum by $p$. All we wish to determine here is
the position of the particle at time $t$. Following Heisenberg’s method,
we could use a microscope for this purpose, but we prefer to use a 
simple pinhole camera, proceeding as follows. Let the opaque screen 
S have a small circular aperture of radius $r$. At time $t$ let the particle
be at a distance $a$ from S. Direct a beam of light of wavelength $\lambda$
along the $x$ axis to illuminate the particle. Of the light scattered by
the particle, let us observe that which passes through the aperture.
This light will produce a diffraction pattern on a screen S’, the bright
central spot having a radius $\lambda b/r$, where $b$ is the distance between the
two screens. Under ideal observation, this spot is the region within
which the screen S’ is hit by the light quanta. Now let us suppose
that it is a single light quantum which hits the particle at the instant $t$
and is reflected in the direction of the aperture. The position of the
particle now can be located by the projection of the point of incidence
of the quantum through the aperture, which is assumed to be infinitely
small, onto the plane in which the particle lies. If the quantum passes
through the aperture without deflection, the incidence on S’ will be the
center of the central diffraction circle and the projection will be on the
particle itself. In general, however, the quantum may strike the circle
at a distance anywhere up to $\lambda b/r$ from the center. This means that the
projection onto the particle plane may be at points up to
$(\lambda b/r)(a/b) = \lambda a/r$ from the particle’s position. Thus the observer can determine
the $x$ and $y$ coordinates of the particle’s position with possible errors $\Delta x$
and $\Delta y$ each equal to $\lambda a/r$. By using light of sufficiently short wavelength
this error can be made as small as desired.

Fig. 1.

Now let us consider how the momentum is affected by the act of
observation. The original momentum of the quantum was $h\nu/c$.
The reflection of the quantum leaves the amount of momentum nearly
the same, the small reduction due to the Compton effect being negligible.
However, the direction is changed in such a way that after
reflection it deviates from the $z$ axis by a small angle $\theta$. This deviation
must be $\le \alpha$, where $\alpha$ is the semi-angle subtended at the particle
by the aperture; it is defined by $\sin\alpha = r/a$. Specifying the direction
of the quantum after reflection from the particle by the polar angles
$\theta$ and $\phi$ and denoting the momentum of the particle after reflection by
$\mathbf{p}$, the components of which are $p_x$, $p_y$, $p_z$, we can write, from the
principle of the conservation of momentum,


$$p + \frac{h\nu}{c} = \frac{h\nu}{c}\sin\theta\cos\phi + p_x$$

$$0 = \frac{h\nu}{c}\sin\theta\sin\phi + p_y$$


If $\theta$ and $\phi$ were known, the components of momentum $p_x$ and $p_y$ could
be calculated from the above equations. Actually, however, $\phi$ is
unknown and the most that can be said about $\theta$ is that it must be
smaller than $\alpha$. Therefore we can maintain only that the first terms
on the right-hand sides of the equations cannot exceed
$h\nu\sin\alpha/c = (h/\lambda)(r/a)$. Thus the components of momentum $p_x$ and $p_y$ after
the measurement of position are given by $p + h\nu/c$ and $0$ respectively,
the possible error being $\Delta p_x = \Delta p_y = (h/\lambda)(r/a)$.

Thus it turns out that we can make the error of position as small as
desired by using a sufficiently short wavelength, but the more accurately
we determine the position the less we know about the momentum
of the particle after the observation. Conversely, if we proceed to
make $\Delta p_x$ as small as desired, as, from the expression above, we can
by using a sufficiently long wavelength, we increase the position error
$\Delta x$. Thus an exact simultaneous measurement of coordinates and
momenta proves to be impossible. Heisenberg’s uncertainty principle
expresses this fact by the fundamental relations


$$\Delta x\,\Delta p_x \ge h \qquad \Delta y\,\Delta p_y \ge h \qquad \Delta z\,\Delta p_z \ge h \tag{1}$$


The equality sign is used in these relations if the interaction between 
measurement and state has been checked as far as possible. Other- 
wise the greater-than sign is employed. 
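
The trade-off between the two errors can be checked numerically. The following Python sketch is only an illustration and not part of the original argument: the function names and the sample geometry (a = 1 cm, r = 1 mm) are assumptions chosen for the example, and only the two estimates $\Delta x = \lambda a/r$ and $\Delta p_x = (h/\lambda)(r/a)$ are taken from the pinhole-camera discussion.

```python
# Sketch of the pinhole-camera estimate: position error versus momentum error.
H = 6.62607015e-34  # Planck's constant in J*s

def position_error(wavelength, a, r):
    """Delta x = lambda * a / r: diffraction spot projected back onto the particle plane."""
    return wavelength * a / r

def momentum_error(wavelength, a, r):
    """Delta p_x = (h / lambda) * (r / a): unknown direction of the scattered quantum."""
    return (H / wavelength) * (r / a)

# Assumed geometry for illustration: particle 1 cm from the screen, aperture radius 1 mm.
a, r = 1e-2, 1e-3
for wavelength in (5e-7, 1e-10, 1e-12):  # roughly visible light, X-rays, gamma rays
    dx = position_error(wavelength, a, r)
    dp = momentum_error(wavelength, a, r)
    print(f"lambda={wavelength:.0e} m  dx={dx:.3e} m  dp={dp:.3e} kg m/s  dx*dp/h={dx * dp / H:.3f}")
```

Shortening the wavelength reduces the position error and raises the momentum error in the same proportion, so the product stays at $h$ whatever geometry is assumed.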

In what we have been discussing above, the velocity was assumed to 
be known. We must investigate now how we can measure the velocity 
with which the particle was moving up to the instant $t$, at which time
the position was determined. For this purpose we must compare the
position $P$ of the particle at time $t$ with its position $P'$ at time $t' < t$,
known from a previous measurement. The desired velocity then is
given in magnitude and direction by $(P - P')/(t - t')$. By making
the interval $t - t'$ sufficiently great we can determine the velocity to
any desired accuracy, since the errors in the determinations of $P$ and
$P'$ are outweighed by the great value of the denominator $t - t'$.
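
A short numerical sketch may make this point concrete. It assumes, purely for illustration, that each position is uncertain by a fixed amount and that the times themselves are known exactly; the function name and the sample numbers are hypothetical.

```python
# Worst-case error of a velocity computed from two position measurements.
def velocity_error(dP, dP_prime, interval):
    """Error of (P - P') / (t - t') when P and P' are uncertain by dP and dP_prime."""
    return (dP + dP_prime) / interval

# Assumed position errors of one nanometre each, with growing time intervals.
for interval in (1e-6, 1e-3, 1.0):
    print(f"t - t' = {interval:.0e} s  velocity error = {velocity_error(1e-9, 1e-9, interval):.1e} m/s")
```

The position errors stay fixed while the denominator grows, so the velocity error falls in inverse proportion to the interval.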

It might be supposed that this procedure ultimately would provide a 
method for determining the exact simultaneous values of coordinate 
and momentum in contradiction to the uncertainty relations. Actually,
however, the particle no longer moves with the velocity $(P - P')/(t - t')$,
for, at the instant $t$ when the position $P$ was ascertained,
the velocity was changed by the process of measurement, and it is 
this changed motion on which the future of the system depends. 

For some purposes it is useful to formulate relations (1) as follows: 
the observed state of a system cannot be represented in the phase 
space by a definite point but, rather, corresponds to a space element. 
The dimensions of an element depend on the special kind of experiment
by means of which the measurement is made. For a moving
particle, its volume would be given by $\Delta x\,\Delta y\,\Delta z\,\Delta p_x\,\Delta p_y\,\Delta p_z$. Under no
circumstance can it be less than $h^3$ and, if the system in question has
$n$ degrees of freedom, it can be no smaller than $h^n$.

3. Heisenberg’s Uncertainty Relations. Method of Doppler 
Effect. On the basis of the preceding discussion, we are not yet sure 
whether relations (1) hold for all methods of measuring coordinates and 
momenta since, a priori, we cannot exclude the possibility of a pro- 
cedure furnishing a more accurate measurement of the quantities 
than the method of the pinhole camera. Actually, though, no one has 
found a method, at least up to the present, that invalidates Heisen- 
berg’s relations. Whatever method we adopt to observe the state of a 
system, a precise measurement of the simultaneous values of two 
conjugate observables always turns out to be impossible. For exam- 
ple, we might think of measuring the velocity of a particle by means of 
the Doppler effect in the following way. Let us assume that the 
position $x_0$, $y_0$, $z_0$ of a particle at time $t_0$ is known, and that from this
point the particle is moving with an unknown velocity the components
of which are $v_x$, $v_y$, $v_z$. In order to determine $v_x$, a light wave of frequency
$\nu$ is caused to collide with the particle in the direction of the
$x$ axis. Let the energy of the light wave be that of a quantum $h\nu$.
Now let us measure the frequency $\nu'$ of the light which is reflected by
the particle in the negative $x$ direction. From the theory of the
Doppler effect, the frequencies $\nu$ and $\nu'$ are related by the equation


$$\nu' = \nu\left(1 - \frac{2v_x}{c}\right)$$


Solving this for $v_x$ gives


$$v_x = \frac{c(\nu - \nu')}{2\nu}$$


Now, to obtain the real velocity of the particle, we also must consider
the recoil of the particle due to the reflection of the light quantum.
This involves a change in momentum of $2h\nu/c$, and therefore $v_x$ is
increased by $2h\nu/mc$, where $m$ is the mass of the particle. Thus,
assuming that we are able to measure the frequencies $\nu$ and $\nu'$ with
sufficient accuracy, we could represent the exact value of $v_x$ by the
equation

$$v_x = \frac{c(\nu - \nu')}{2\nu} + \frac{2h\nu}{mc}$$


Now $\nu$ and $\nu'$ can be measured, but the following situation introduces a
difficulty. Light can be produced only in wave trains of a finite
length which require a finite time $\Delta t$ to pass a given point in space.
A finite wave train is never monochromatic (as will be pointed out in
Section 6) but is always resolvable into a multitude of partial waves
by a Fourier analysis. These partial waves superpose in such a way
that they interfere destructively outside the train. The frequencies of
the harmonic components of the train spread over a certain interval
$\Delta\nu$, which is wider the shorter the wave train, the spectral determinacy
being the less, the broader the interval $\Delta\nu$. It will be demonstrated
in Section 6 that $\Delta\nu\,\Delta t \approx 1$, where $\Delta t$ is the interval of time required
by the wave train to pass a fixed point in space.
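
The relation between the length of a wave train and its spectral width can be illustrated numerically before it is derived. The following Python sketch makes several assumptions that are not in the text: the wave train is a cosine of frequency `nu0` switched on for a time `duration` and zero otherwise, the Fourier integral is approximated by the midpoint rule, and the spectral width is taken to be the distance from the central peak to the first zero of the spectrum, which is only one of several possible conventions.

```python
import cmath
import math

def spectrum_amplitude(nu, nu0, duration, steps=2000):
    """Midpoint-rule estimate of |integral_0^T cos(2*pi*nu0*t) * exp(-2*pi*i*nu*t) dt|."""
    dt = duration / steps
    total = 0j
    for k in range(steps):
        t = (k + 0.5) * dt
        total += math.cos(2 * math.pi * nu0 * t) * cmath.exp(-2j * math.pi * nu * t)
    return abs(total) * dt

def first_zero_offset(nu0, duration, scan_steps=100):
    """Offset from nu0, searched between 0.5/T and 1.5/T, at which the amplitude is smallest."""
    offsets = [(0.5 + k / scan_steps) / duration for k in range(scan_steps + 1)]
    return min(offsets, key=lambda d: spectrum_amplitude(nu0 + d, nu0, duration))

nu0 = 50.0  # assumed carrier frequency, arbitrary units
for duration in (1.0, 2.0, 4.0):
    d_nu = first_zero_offset(nu0, duration)
    print(f"train duration={duration}  half-width={d_nu:.3f}  product={d_nu * duration:.2f}")
```

Under these assumptions the half-width shrinks in inverse proportion to the duration of the train, so that the product of the two stays close to unity.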

On principle, therefore, we can determine the velocity $v_x$ of a particle
at time $t_0 + \Delta t$ to any desired degree of accuracy by using a sufficiently
long wave train. For then the spectral indeterminacy $\Delta\nu = 1/\Delta t$
becomes very small, and the same holds for the uncertainty $\Delta v_x$, since
this is represented by

$$\Delta v_x = \frac{c\,\Delta\nu}{2\nu} + \frac{2h\,\Delta\nu}{mc}$$

Since the second term is very small relative to the first, $\Delta v_x$ can be
represented by $c/(2\nu\,\Delta t)$. But, the greater the interval $\Delta t$, the less
accurately can we locate the particle at $t_0 + \Delta t$, because we are
ignorant of the instant at which the recoil of the particle occurs. If
the light quantum is reflected at the beginning of $\Delta t$, that is, at $t_0$,
the position of the particle at $t_0 + \Delta t$ will be

$$x = x_0 + \left(\frac{c(\nu - \nu')}{2\nu} + \frac{2h\nu}{mc}\right)\Delta t$$

But, if the quantum is reflected at the instant $t_0 + \Delta t$,

$$x = x_0 + \frac{c(\nu - \nu')}{2\nu}\,\Delta t$$

Thus we only can know the position with the possible error
$\Delta x = (2h\nu/mc)\,\Delta t$, and again we obtain

$$\Delta x\,\Delta p_x = \left(\frac{2h\nu}{mc}\,\Delta t\right)\left(\frac{mc}{2\nu\,\Delta t}\right) = h$$
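
The estimate for the Doppler method can be checked with numbers. The sketch below is an illustration under stated assumptions: the particle is taken to be an electron (mass approximately 9.109e-31 kg), the light frequency of 6e14 Hz is an arbitrary choice in the visible range, and the function name is hypothetical. The formulas for $\Delta x$ and $\Delta v_x$ are those of the Doppler-effect argument, with $\Delta\nu = 1/\Delta t$.

```python
H = 6.62607015e-34   # Planck's constant, J*s
C = 2.99792458e8     # speed of light, m/s

def doppler_errors(nu, mass, train_duration):
    """Return (dx, dp_leading, dp_full) for the Doppler-effect measurement."""
    d_nu = 1.0 / train_duration                       # spectral width of the wave train
    dx = (2 * H * nu / (mass * C)) * train_duration   # unknown instant of the recoil
    dv_leading = C * d_nu / (2 * nu)                  # first term of the velocity error
    dv_full = dv_leading + 2 * H * d_nu / (mass * C)  # including the small second term
    return dx, mass * dv_leading, mass * dv_full

ELECTRON_MASS = 9.109e-31   # kg, approximate value (assumed particle)
NU = 6.0e14                 # Hz, roughly visible light (assumed)
for train_duration in (1e-12, 1e-9, 1e-6):
    dx, dp, dp_full = doppler_errors(NU, ELECTRON_MASS, train_duration)
    print(f"dt={train_duration:.0e} s  dx={dx:.3e} m  dp={dp:.3e} kg m/s  "
          f"dx*dp/h={dx * dp / H:.6f}  full/leading={dp_full / dp:.6f}")
```

A longer wave train sharpens the momentum and blurs the position in the same proportion, and for these sample values the neglected second term changes the momentum error only slightly.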


Thus it seems that relations (1) are independent of the choice of 
both the experiment and the coordinates. Accordingly we propose 
the following theorem as fundamental to the new mechanics:


There is no experiment by means of which two canonically conjugate 
observables can be measured with a greater accuracy than is foreseen 
by relations (1). 


Should such an experiment be discovered unexpectedly, all that fol- 
lows would be wrong. It is not very likely, however, that the pro- 
posed theorem will be disproved because, if we go to the very root of 
the uncertainty relations, we find that they originate in the fact that 
the course of all elementary processes evades our control. The 
method of the pinhole camera does not permit an exact measurement 
because we are unable to say in what direction the light quantum will 
be deflected. The Doppler effect method fails because we do not 
know the instant of time at which the reflection of the quantum will 
occur. The question then is whether we shall be able some day to 
predict the course of elementary reactions. Shall we ever, for exam- 
ple, learn how to direct a light quantum so that it will be reflected by 
the struck particle exactly in a given direction? Should this turn out 
to be possible in the future, the supposition of the quantum theory, 
that, on principle, the elementary processes under investigation can- 
not be analyzed, would be incorrect and a new theory that could 
dispense with the constant h would have to be sought. The facts, 
however, undoubtedly favor the quantum theory, and therefore the 
current viewpoint is that the processes governed by h are not subject 
in any way to a space-time description. They appear to our observation
as discontinuities in the course of events and cannot be interpreted
on the basis of causality. The question whether they are, for that 
reason, acausal or whether their causal character only escapes our 
limited powers of observation is, for the physicist, meaningless because, 
if the supposition that the elementary processes cannot be analyzed is 
true, then there is no experiment by means of which the question can 
be answered. The only thing the physicist can maintain is that there 
is no possibility for him to apply the principle of causality to ele- 
mentary micro processes as he experiences them, and to him experi- 
ence alone counts. If when causality fails to explain an event we 
call it chance, then we may say that in all elementary reactions chance, 
or something we cannot distinguish from it, is at work. 

4. The New Mechanics and the Principle of Causality. It 
turns out, then, that in quantum physics a concept, namely that of 
chance, plays a role for which, taken in its full significance, there has 
been no room in the methods of classical physics. Although it is 
true that this concept has been employed because of its usefulness in 
surveying events that are composed of a great number of non-controllable
single processes, it was never assumed that chance actually, anywhere
and at any time, could interfere with the course of events. In
contrast to this, quantum mechanics considers it an established fact 
that in the observable world something is at work which can be 
designated only as chance. Does this mean, therefore, that we must 
deny the principle of causality in the physical world? 

Before answering this question let us adopt the principle that the 
physicist must be concerned only with the things he can observe. 
Accordingly a question is meaningless to him if there is no possibility 
of answering it on the basis of experiment. Therefore, when we ask 
him his opinion of causality, it is necessary first to formulate the 
principle so that he can test it by experiment. A formulation that 
fulfills this requirement is: in an isolated system, an identical initial 
state always leads to an identical sequence of later states. This is a 
statement that is proved by observation to be either true or false, 
and the physicist has only the choice of accepting or denying. Actually
he has to deny it, for when he performs a great number of experiments
on a system, starting always from the same initial state (the
word “same” meaning as far as the possibilities of observation are
concerned), he finds the system changing every time in a different
way. It is true that the objection could be offered that this does not 
disprove causality because the initial states only seem to be identical 
to us whereas in reality they may differ one from the other. Such an 
argument, however, leads to metaphysics. It is not the business of 
the physicist to speculate on what would be if this or that condition, 
which is incapable of fulfillment, were realized. The physicist con- 
cerns himself only with what can be observed and therefore is com- 
pelled to consider two states as identical if no experiment can prove 
them to be different. Therefore he is not in a position to agree with 
the statement by which we have defined causality. 

Thus the physicist is not interested in the question whether the 
world "in itself," that is, as it exists independently of our observations,
may be governed by causality in the defined sense of the word.
He is concerned only with the world as it presents itself to his observa- 
tion, and he feels that chance enters into certain aspects of it. 

We do not mean by this that there is nothing but anarchy in nature. 
We deny only the possibility of predicting in a unique way the future 
behavior of a system from its present state. As we have seen, such 
predictions cannot be made and there remains only the need to explain 
how classical physics can reconcile its determinacy of events with this 
fact. The explanation is that, when we measure the position of a 
body, the Compton effect, owing to the smallness of Planck’s constant
$h$, cannot change the velocity by a measurable amount unless the mass
of the body is extremely small. We have seen from relations (1)
that the uncertainty $\Delta v_x$ of $v_x$, which is due to the Compton effect,
is given by $\Delta v_x = h/(m\,\Delta x)$. From this it is clear that, for a body with
great mass, chance has too small a scope to have a noticeable effect.
The effect is of importance for extremely light particles, which are so
sensitive that they cannot withstand observation. For instance, consider an
electron. In order to determine its position within an atom we have
to strive for an accuracy of at least $10^{-8}$ cm. Hence for the velocity
this means an uncertainty

$$\Delta v_x \geq \frac{6.62 \times 10^{-27}}{(10^{-27})(10^{-8})} \approx 10^{8} \text{ cm/sec}$$


On the other hand, if we consider a small macroscopic body of 1 gram 
mass, an accuracy of the order 10~* cm in determining its position 
is quite sufficient. The uncertainty here is 


—27 
Av, = fart Oc ~ 10~** em/sec 

10~? 
a value which lies far below the limit of any measurability. Chance is 
at work in the macro as well as in the micro world, but it is important
only in the micro world. The situation would be quite different if
the magnitude of h were great, for then chance would prevail in the 
macro world also and it would be difficult to recognize any law as 
valid because of it. For example, a body that is moving subject to 
no force would change its velocity every time we observed it, and thus 
it would be impossible to conceive the law of inertia in this seeming 
disorder. 
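The estimate $\Delta v_x = h/(m\,\Delta x)$ used above can be checked numerically. The following is a minimal sketch in CGS units; the function name, the rounded electron mass, and the list of position accuracies tried for the 1-gram body are illustrative assumptions, not values taken from the text.

```python
# Velocity uncertainty from dv_x = h / (m * dx), in CGS units.
H_PLANCK = 6.62e-27  # erg*sec, the value quoted in the text


def velocity_uncertainty(mass_g: float, dx_cm: float) -> float:
    """Return dv_x in cm/sec for a body of mass mass_g located within dx_cm."""
    return H_PLANCK / (mass_g * dx_cm)


# Electron: mass rounded to 1e-27 g, position known to 1e-8 cm.
dv_electron = velocity_uncertainty(1e-27, 1e-8)

# A 1-gram body; these position accuracies are assumed for illustration only.
dv_macro = {dx: velocity_uncertainty(1.0, dx) for dx in (1e-2, 1e-4, 1e-6)}

print(f"electron: {dv_electron:.2e} cm/sec")
for dx, dv in dv_macro.items():
    print(f"1 g body, dx = {dx:.0e} cm: {dv:.2e} cm/sec")
```

Whatever accuracy is assumed for the macroscopic body, the resulting uncertainty stays many orders of magnitude below anything measurable, while for the electron it is comparable to the velocity itself.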

However, it must not be concluded that quantum mechanics main- 
tains complete indeterminism. Such a view is out of the question 
because it excludes the possibility of physics altogether. If there were 
nothing but chance, nothing could be said about the future. In 
reality, besides chance, there are also laws, and the doctrine of quantum 
mechanics is not that causality does not exist but only that it must 
not be interpreted as in classical mechanics. We cannot conclude 
exactly the future of a system from a given initial state because the 
data of this state do not suffice. They do, however, suffice for deter- 
mining the probability of finding the system in a given state at a future 
time. It is in this statistical sense of future prediction that quantum 
mechanics expresses the principle of causality. It should not be 
maintained that we do not know anything about the future but rather
that in many respects we do not know it exactly. The qualification
"in many respects" is necessary because in quantum mechanics there
are also certain observables the values of which can be predicted
exactly for the future if the initial values are known exactly. The
principle of the conservation of energy holds in the new mechanics as
it does in the old. If we know that a system has an energy $E$ at $t = 0$,
we may be sure that measurements of energy made later will furnish
the same value for $E$ if the system has not been disturbed in the meantime.
Deferring until later a precise statement of causality from the
viewpoint of quantum mechanics, it is sufficient for our present pur- 
poses to express it as follows: a given initial state does not determine 
the future development of a system, but it does determine the proba- 
bility that the development will take a given course. 

5. de Broglie Waves. Let us consider the problem of a free par- 
ticle and investigate how quantum mechanics attempts to solve it. 
From a measurement at $t_0$, it is concluded that the particle lies within
the space element $\Delta x\,\Delta y\,\Delta z$ and at that instant has momentum components
within $\Delta p_x$, $\Delta p_y$, $\Delta p_z$. At $t > t_0$, another very precise measurement
is made. What is the probability of finding the particle at a
given point $xyz$ at this time? The question is reasonable, for it can
be decided experimentally. In many experiments we start from the
same initial conditions and determine the exact position of the particle
at time $t$. However, we must not presume that such experiments
would verify classical mechanics. If this were so, we would have to 
apply the methods of classical statistics to determine the desired 
probability. But this would not explain how it is possible for elec- 
trons to be diffracted when they pass through a crystal. We may take 
this phenomenon as proof that the principles of classical mechanics 
are approximations that may be used for macroscopic systems and do 
not hold for bodies of extremely small mass. In this search for the 
principles of a new mechanics which would hold for microscopic bodies, 
Louis de Broglie proposed a solution of the problem. His theory may 
be characterized as an attempt to interpret a mechanical event by 
means of a wave representation. Two facts suggested this idea to 
de Broglie. The first was the fundamental relation, $E = h\nu$, wherein a
certain frequency v, being associated with the energy, points to some 
connection between a mechanical event and wave motion. The 
second fact was that in the Hamiltonian form of classical mechanics 
such an interpretation had already been used. 

To carry out the plan of a wave mechanics, de Broglie associated a 
harmonic wave motion with every moving particle, the frequency $\nu$ of
this wave motion being that given in the energy relation $E = h\nu$.
This association does not mean that the motion of the particle is 
actually that of a wave. Whether such an interpretation is plausible 
will be discussed in Chapter 6. For the present discussion the wave is 
used as a figurative description of a certain function of coordinate and 
time by means of which we can learn about the behavior of the particle. 
Thus, since the wave has a purely symbolic significance, there need 
be no question about the medium of propagation. 

To clarify further the idea of wave mechanics, let us consider a 
particle moving relative to a coordinate system K in the direction of 
the x axis with a momentum which, we assume, has been measured 
accurately. According to relations (1), we cannot know anything
about the position of the particle since an infinitely small $\Delta p_x$ introduces
an infinitely great uncertainty $\Delta x$. The total energy of the particle,
kinetic as well as rest energy, is given by $E = mc^2/\sqrt{1 - \beta^2}$, and its
momentum by $p = mv/\sqrt{1 - \beta^2}$, where $m$ is the rest mass of the
particle and $\beta = v/c$. If, on the other hand, we refer the motion not
to $K$ but to a coordinate system $K_0$ in which the particle is at rest, its
momentum vanishes and its energy reduces to $E_0 = mc^2$. Now
imagine that the whole space is filled with a fictitious medium, and 
associate a vibration of this medium with the motion of the particle. 
Assume that all the particles of the medium vibrate with a frequency
$\nu_0$ relative to $K_0$, where $\nu_0 = E_0/h$, and that all the particles are in
the same phase of vibration. Then for an observer in the $K_0$ system
this motion may be described by $a \cos 2\pi\nu_0 t_0$. But to an observer in
the K system, relative to which the particle is moving with a velocity 
v, the motion of the medium is that of a wave in the x direction. The 
Lorentz transformation for the transition from $K_0$ to $K$ is

$$t_0 = \frac{t - vx/c^2}{\sqrt{1 - \beta^2}}$$

and thus the motion of the medium in $K$ can be represented by

$$a \cos 2\pi \frac{\nu_0}{\sqrt{1 - \beta^2}} \left(t - \frac{vx}{c^2}\right) = a \cos 2\pi\nu \left(t - \frac{x}{V}\right) \tag{2}$$

where $\nu = \nu_0/\sqrt{1 - \beta^2}$, $c^2/v = V$. Thus there is observed in the
system $K$ a wave whose frequency is

$$\nu = \frac{\nu_0}{\sqrt{1 - \beta^2}} = \frac{E_0}{h\sqrt{1 - \beta^2}} = \frac{E}{h} \tag{3}$$

and a velocity of propagation $V$ which is related to the velocity $v$ of
the particle by

$$Vv = c^2 \tag{4}$$

The wavelength is

$$\lambda = \frac{V}{\nu}$$

Therefore from (4)

$$\lambda = \frac{c^2}{v\nu}$$

But $p = vE/c^2 = vh\nu/c^2$; therefore

$$\lambda = \frac{h}{p} \tag{5}$$

The relations

$$\nu = \frac{E}{h} \quad \text{and} \quad \lambda = \frac{h}{p} \tag{6}$$

hold for any coordinate system, for, in transforming from reference
system $K$ to a new system $K'$ relative to which the particle moves with
a velocity $v'$ in the $x$ direction, we obtain $\nu' = E'/h$ and $\lambda' = h/p'$.
Thus in any coordinate system the wave can be represented by

$$a \cos \frac{2\pi}{h}(Et - px)$$

or, on letting $h/2\pi = \hbar$,

$$a \cos \frac{1}{\hbar}(Et - px) \tag{7}$$
However, no relation exists between the wave and the positions of the 
particle during its motion because a clear-cut measurement of momen- 
tum excludes the knowledge of position. 
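Relations (3) through (6) can be verified numerically for a concrete particle. The sketch below assumes an electron moving at one tenth the velocity of light; the constants, the function name, and the dictionary keys are illustrative choices, not part of the text.

```python
import math

C = 2.998e10    # velocity of light, cm/sec
H = 6.62e-27    # Planck's constant, erg*sec
M_E = 9.11e-28  # electron rest mass in grams (assumed example particle)


def de_broglie(m: float, v: float) -> dict:
    """Frequency, phase velocity and wavelength of the wave for a particle of rest mass m."""
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    energy = m * C**2 * gamma         # E = mc^2 / sqrt(1 - beta^2)
    momentum = m * v * gamma          # p = mv / sqrt(1 - beta^2)
    nu = energy / H                   # relation (3)
    phase_velocity = C**2 / v         # relation (4)
    wavelength = phase_velocity / nu  # lambda = V / nu
    return {
        "momentum": momentum,
        "nu": nu,
        "phase_velocity": phase_velocity,
        "wavelength": wavelength,
    }


w = de_broglie(M_E, 0.1 * C)
print(f"wavelength from V/nu: {w['wavelength']:.3e} cm")
print(f"wavelength from h/p : {H / w['momentum']:.3e} cm")
```

Both routes to the wavelength agree, as relation (5) requires, and the phase velocity comes out larger than the velocity of light.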
Suppose that the particle is moving in the direction $\alpha\beta\gamma$ and with a
momentum the components of which are $p_x$, $p_y$, $p_z$. The wave expression
(7) must be modified to read

$$a \cos \frac{1}{\hbar}[Et - (p_x x + p_y y + p_z z)] \tag{8}$$

The components of momentum may be expressed as

$$p_x = \alpha p \qquad p_y = \beta p \qquad p_z = \gamma p$$

But $p = h/\lambda$; therefore

$$p_x = \frac{\alpha h}{\lambda} \qquad p_y = \frac{\beta h}{\lambda} \qquad p_z = \frac{\gamma h}{\lambda}$$

Now, since $V = c^2/v$ with $v < c$, the de Broglie waves are propagated
with a velocity that can never be less than the velocity of light, and
thus they cannot be real waves but are purely symbolic in character.

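That the velocity $V$ depends on the frequency can be demonstrated
by

$$h\nu_0 = mc^2 \quad \text{and} \quad h\nu = \frac{mc^2}{\sqrt{1 - \beta^2}}$$

Therefore $\nu_0/\nu = \sqrt{1 - \beta^2}$; $\beta = \sqrt{1 - \nu_0^2/\nu^2}$. Therefore, since $V = c^2/v$
and $\beta = v/c$, we have

$$V = \frac{c}{\sqrt{1 - \nu_0^2/\nu^2}} \tag{9}$$

And, if $n$ is the index of refraction $c/V$, we obtain

$$n = \sqrt{1 - \nu_0^2/\nu^2} \tag{10}$$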
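Relation (10) shows that the index of refraction equals $\beta = v/c$, so it is always less than one. A small sketch of this dispersion law follows; the function name and the sample frequency ratio are assumptions made for illustration.

```python
import math


def refractive_index(nu: float, nu0: float) -> float:
    """n = sqrt(1 - nu0^2 / nu^2), relation (10); a moving particle has nu > nu0."""
    if nu <= nu0:
        raise ValueError("relation (10) requires nu > nu0")
    return math.sqrt(1.0 - (nu0 / nu) ** 2)


# A particle with beta = 0.6 has nu / nu0 = 1 / sqrt(1 - 0.36) = 1.25.
n = refractive_index(1.25, 1.0)
print(f"n = {n:.3f}")
```

The printed value reproduces $\beta$ for the chosen frequency ratio, and the phase velocity $c/n$ grows without limit as $\nu$ approaches $\nu_0$.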


6. The Method of Wave Packets. Since the de Broglie waves are 
symbolic only, their usefulness might be questioned. To answer 
this question, let us refer once more to our original problem, namely 
the probability of locating a particle at a given point and given 
time, assuming that initially the particle was within the element 
$\Delta x\,\Delta y\,\Delta z\,\Delta p_x\,\Delta p_y\,\Delta p_z$. It is the aim of wave mechanics to determine
this probability by making use of the de Broglie wave motion, which
can be associated in a certain way with the observed initial state of
the particle. Obviously a single wave of definite frequency and direction
will not be sufficient because the momentum of the particle is
known only with an uncertainty $\Delta p_x\,\Delta p_y\,\Delta p_z$. Hence it becomes
necessary to represent the initial state by an ensemble of waves so
that for each vector $\mathbf{p}$ within $\Delta p_x\,\Delta p_y\,\Delta p_z$ there is a corresponding
wave. When the components $p_x$, $p_y$, $p_z$ are replaced by $\xi$, $\eta$, $\zeta$, the
expression for the wave associated with the momentum becomes

$$a(\xi, \eta, \zeta)\, e^{\frac{i}{\hbar}[Et - (\xi x + \eta y + \zeta z)]} \tag{11}$$

This expression can represent the wave by either its real or its imaginary
part. Here both the amplitude and the phase may depend on
$\xi$, $\eta$, $\zeta$. By writing a similar expression for every vector $\mathbf{p}$ within the
uncertainty element we can represent the state by

$$u(xyzt) = \iiint a(\xi\eta\zeta)\, e^{\frac{i}{\hbar}[Et - (\xi x + \eta y + \zeta z)]}\, d\xi\, d\eta\, d\zeta \tag{12}$$


the integration being extended over the whole element. The ensemble 
of waves represented by (12) is called a wave packet.

In order that equation (12) may have physical meaning in describing
the given state, the amplitude function $a(\xi\eta\zeta)$ must be chosen so
that there is a definite relation between the wave group and the space
element $\Delta x\,\Delta y\,\Delta z$ in which the particle initially is situated. Such
a relation can be established by making use of the following theorem
due to Rayleigh: if in an ensemble of waves of the form (11) the
parameters $\xi$, $\eta$, $\zeta$ vary continuously within certain intervals $\Delta\xi\,\Delta\eta\,\Delta\zeta$,
the amplitude function $a(\xi\eta\zeta)$ can be chosen always in such a way that
the total intensity of the waves is greater than zero only within a space
the dimensions of which are

$$\Delta x \geq \frac{h}{\Delta\xi} \qquad \Delta y \geq \frac{h}{\Delta\eta} \qquad \Delta z \geq \frac{h}{\Delta\zeta}$$

and outside of this space the waves interfere destructively; that is,
the intensity there is zero.

A rigorous proof of this theorem will not be given here. That it is 
plausible will be clear from the following example. Consider two 
parallel waves of infinite extension, the waves being characterized by
$E_1\xi_1\eta_1\zeta_1$ and $E_2\xi_2\eta_2\zeta_2$ respectively. Let us assume that the waves
are in phase at time $t_0$ at a point which we shall choose as origin of
coordinates. As the waves move away from the origin in the direction
of the $x$ axis, a difference in phase continuously develops so that at a
certain distance $x$ this difference will be $\pi$ and hence $x\xi_1 = x\xi_2 + h/2$.
Therefore at this point there is destructive interference. As the propagation
along $x$ continues, we pass through a series of maxima and
minima depending on whether $x(\xi_1 - \xi_2)$ is an even or odd multiple
of $h/2$. Now add to these two waves a group of wave motions the
parameters of which vary continuously between those of the two
given ones. Let the whole ensemble be in phase at the origin at time
$t_0$. The excitation caused by the waves is increased between $x = 0$
and $x = h/[2(\xi_1 - \xi_2)]$, but all outside maxima disappear because the
waves meet there in all possible phases. It appears then that a group
of waves in which $\xi$ varies within the limits $\xi_0$ and $\xi_0 + \Delta\xi$ can be
superposed so that it has a non-vanishing intensity only within a space
$\Delta x = h/\Delta\xi$. This excitation cannot be confined to a smaller space,
but, by introducing suitable phase differences at the origin, we can
make $\Delta x > h/\Delta\xi$. Since the same reasoning applies to the $y$ and $z$
directions, it follows that the behavior of the wave group is in accord 
with Rayleigh’s theorem. 
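The plausibility argument above can be tried out numerically by superposing many waves whose $\xi$ values fill an interval $\Delta\xi$. The sketch below assumes equal amplitudes and equal phases at the origin, which is only one possible choice of $a(\xi)$; with this choice small secondary maxima remain, so it illustrates the scale $h/\Delta\xi$ rather than the full theorem. The momentum values and function name are illustrative.

```python
import cmath
import math

H = 6.62e-27               # Planck's constant, erg*sec
HBAR = H / (2 * math.pi)


def group_intensity(x: float, xi0: float, dxi: float, n_waves: int = 2000) -> float:
    """Normalized intensity at position x (t = 0) of n_waves equal-amplitude waves
    exp(-i * xi * x / hbar) with xi spread uniformly over [xi0, xi0 + dxi]."""
    total = 0j
    for j in range(n_waves):
        xi = xi0 + dxi * (j + 0.5) / n_waves  # midpoint sampling of the interval
        total += cmath.exp(-1j * xi * x / HBAR)
    return abs(total / n_waves) ** 2


XI0 = 1.0e-19      # g*cm/sec, assumed central momentum
DXI = 1.0e-20      # g*cm/sec, assumed momentum interval
WIDTH = H / DXI    # the distance h / (delta xi)

for frac in (0.0, 0.5, 1.0, 1.5):
    value = group_intensity(frac * WIDTH, XI0, DXI)
    print(f"x = {frac:.1f} h/dxi: intensity = {value:.3e}")
```

The intensity is greatest at the origin and falls to zero at $x = h/\Delta\xi$, which is the distance the text identifies as the extent of the group.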

In addition, we can prove that the interval of time $\Delta t$, during which
positive excitation occurs at a certain point, is related to an energy
interval $E_1 - E_2 = \Delta E$ by the relation $\Delta t\,\Delta E \geq h$. For, if we start
out with the same two waves which are in phase at the origin and then
plot the oscillation against time, we observe a sequence of maxima
and minima, the first minimum being obtained when $E_1 t = E_2 t + h/2$.
Between $t = 0$ and $t = h/[2(E_1 - E_2)]$, the effect of the
additional waves, the energies of which lie between $E_1$ and $E_2$, will be
one of excitation with all outside maxima removed. Thus the excitation
caused by the wave group lasts for an interval $\Delta t$ which is related to
$\Delta E$ by $\Delta t \geq h/\Delta E$. Since $h\nu = E$, we obtain the relation $\Delta t\,\Delta\nu \geq 1$.

Let us apply the theorem to the wave group represented by (12) 
and try to find a relation between this group and the uncertainties, 
$\Delta x$, $\Delta y$, $\Delta z$. We must superpose the waves in such a way that they
destroy each other outside the element $\Delta x\,\Delta y\,\Delta z$, giving a total intensity
only within this element. We have seen that this is possible only
when the conditions $\Delta x\,\Delta\xi \geq h$, $\Delta y\,\Delta\eta \geq h$, $\Delta z\,\Delta\zeta \geq h$ are satisfied.
This is precisely what is defined by the uncertainty relations (1).
By virtue of the above, the following theorem can be stated: 


We always can associate a wave packet with the observed initial state 
of a particle. The spectral composition of the packet describes our 
knowledge of the momentum of the particle, whereas the manner of its
superposition describes our knowledge of the position. 


We are able now to give an answer to the question of the probability 
of locating the position of the particle. Leaving the wave packet 
which symbolizes the initial state of the particle undisturbed, we 
determine the distribution of the waves at time $t$. Then the probability
in question for any point in space is given by the square of the
resultant amplitude. This statement brings out the essential idea of 
the statistical nature of wave mechanics as it is applied to a free 
particle. It is not to be considered a purely formal statement but 
rather a real one which can be confirmed or disproved by experiments. 
We shall refer to these experiments later. 

When we do not disturb the waves, the phase differences at a certain 
point will change because of difference in frequencies. This being 
the case, the resultant amplitude will change. The interference 
phenomena produced by the waves keep changing, and wave mechanics 
uses this fact to establish a relationship between the motion of the 
particle and the propagation of the waves. Originally the approach 
to the problem was to identify directly the wave packet with a cor- 
puscle, an idea chiefly proposed by Schroedinger. Such an inter- 
pretation would be practicable if the wave packet remained intact 
during the motion (that is, the space within which the intensity is 
greater than zero remains unchanged in size). Apart from an exception
discussed below, this space does change, because after a certain
time the wave packet gradually loosens and spreads out in all directions,
thus making the position of the particle more and more uncertain.

7. Reconciliation of Wave and Classical Mechanics. Here we 
consider a macroscopic body, wishing to find out how wave mechanics 
can furnish us with information of future events when this informa- 
tion is, if not exactly precise, at least exact enough for all practical 
purposes. It is quite certain that by applying classical mechanics to 
this problem we can determine the position of the particle uniquely at 
time $t$ from its state of motion at $t_0$. Actually, in their application to
macroscopic bodies, the new and old mechanics lead to the same
results. Consider, for example, the motion of a small macroscopic
particle. In determining the position of the center of the particle at
time $t$, there will be a possible error of some fractions of a millimeter in
the measurement. When compared with h this is an extremely great 
inaccuracy, but it is compensated for by the possibility of a very 
accurate measurement of the momentum. Thus the observed state 
can be represented by a wave packet which occupies a macroscopic 
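space $\Delta x\,\Delta y\,\Delta z$ of nearly point dimensions and which corresponds to
the very small element $\Delta\xi\,\Delta\eta\,\Delta\zeta$ of the momentum space between
$\xi_0\eta_0\zeta_0$ and $\xi_0 + \Delta\xi$, $\eta_0 + \Delta\eta$, $\zeta_0 + \Delta\zeta$. In equation (12) replace
$\xi$, $\eta$, $\zeta$ by

$$\xi = \xi_0 + \xi' \qquad \eta = \eta_0 + \eta' \qquad \zeta = \zeta_0 + \zeta' \tag{13}$$

These new variables are very small quantities. The limits of integration
will be 0 and $\Delta\xi$, $\Delta\eta$, $\Delta\zeta$. Making these substitutions, we obtain

$$u(xyzt) = \iiint a(\xi'\eta'\zeta')\, e^{\frac{i}{\hbar}\left\{\left[E_0 + \left(\frac{\partial E}{\partial\xi}\right)_0 \xi' + \left(\frac{\partial E}{\partial\eta}\right)_0 \eta' + \left(\frac{\partial E}{\partial\zeta}\right)_0 \zeta'\right] t - (\xi_0 + \xi')x - (\eta_0 + \eta')y - (\zeta_0 + \zeta')z\right\}}\, d\xi'\, d\eta'\, d\zeta'$$

$E_0$ is the energy corresponding to the momentum $\xi_0\eta_0\zeta_0$, and $(\partial E/\partial\xi)_0$,
$(\partial E/\partial\eta)_0$, $(\partial E/\partial\zeta)_0$ are the derivatives of $E$ relative to $\xi$, $\eta$, $\zeta$ at point
$\xi_0$, $\eta_0$, $\zeta_0$. Since $\xi'$, $\eta'$, $\zeta'$ are small quantities, it is sufficient to keep
only the first-order terms in the expansion of $E$. Placing outside the
integral sign those factors not subject to the integration, we obtain

$$u(xyzt) = e^{\frac{i}{\hbar}[E_0 t - (x\xi_0 + y\eta_0 + z\zeta_0)]} \iiint a(\xi'\eta'\zeta')\, e^{\frac{i}{\hbar}\left\{\left[\left(\frac{\partial E}{\partial\xi}\right)_0 t - x\right]\xi' + \left[\left(\frac{\partial E}{\partial\eta}\right)_0 t - y\right]\eta' + \left[\left(\frac{\partial E}{\partial\zeta}\right)_0 t - z\right]\zeta'\right\}}\, d\xi'\, d\eta'\, d\zeta'$$

from which complex expression the real part is to be used. Thus, if
we let $Ae^{i\lambda}$ represent the complex integral in the above equation, both
$A$ and $\lambda$ being real, $u$ becomes

$$u = A \cos\left\{\frac{1}{\hbar}[E_0 t - (x\xi_0 + y\eta_0 + z\zeta_0)] + \lambda\right\} \tag{14}$$

where $A$ is the amplitude of the motion and $A^2$ represents the probability
of locating the particle at time $t$ at point $xyz$.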
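To determine $A^2$ we find the product of the two conjugate integrals

$$Ae^{i\lambda} = \iiint a(\xi'\eta'\zeta')\, e^{\frac{i}{\hbar}\left\{\left[\left(\frac{\partial E}{\partial\xi}\right)_0 t - x\right]\xi' + \left[\left(\frac{\partial E}{\partial\eta}\right)_0 t - y\right]\eta' + \left[\left(\frac{\partial E}{\partial\zeta}\right)_0 t - z\right]\zeta'\right\}}\, d\xi'\, d\eta'\, d\zeta'$$

$$Ae^{-i\lambda} = \iiint a^*(\xi'\eta'\zeta')\, e^{-\frac{i}{\hbar}\left\{\left[\left(\frac{\partial E}{\partial\xi}\right)_0 t - x\right]\xi' + \left[\left(\frac{\partial E}{\partial\eta}\right)_0 t - y\right]\eta' + \left[\left(\frac{\partial E}{\partial\zeta}\right)_0 t - z\right]\zeta'\right\}}\, d\xi'\, d\eta'\, d\zeta'$$

In each of the integrals on the right it will be observed that $x$, $y$, $z$, and
$t$ appear only in the terms $(\partial E/\partial\xi)_0 t - x$, $(\partial E/\partial\eta)_0 t - y$, $(\partial E/\partial\zeta)_0 t - z$.
The same must be true for the product of the two, which gives us $A^2$.
Thus we can conclude that the resultant amplitude will remain unaltered
when $x$, $y$, $z$, and $t$ change in such a way that

$$\left(\frac{\partial E}{\partial\xi}\right)_0 t - x = \text{constant} \qquad \left(\frac{\partial E}{\partial\eta}\right)_0 t - y = \text{constant} \qquad \left(\frac{\partial E}{\partial\zeta}\right)_0 t - z = \text{constant}$$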


From this we see that the amplitudes are propagated in straight lines
with a velocity the components of which are $(\partial E/\partial\xi)_0$, $(\partial E/\partial\eta)_0$,
$(\partial E/\partial\zeta)_0$. Evidently the relative space distribution of the amplitudes
is not altered by this displacement, and at every instant of time the
wave packet maintains a constant size and a form which is nearly
point-shaped. This packet moves through space with a velocity $\mathbf{c}$,
the components of which are given above. So, in the case being considered,
the position of the particle can be predicted for every instant
of time with certainty. Thus, when wave mechanics is applied to
macroscopic systems, it loses its characteristic uncertainty, which is its
distinguishing mark when microscopic systems are involved. Of
greater import is the fact that the new and old mechanics are in complete
agreement on proving that the velocity $\mathbf{c}$ of the packet is identical
with the velocity $\mathbf{v}$ of the particle corresponding to the packet. The
energy $E$ is given by $E = (\xi^2 + \eta^2 + \zeta^2)/2m$. Then

$$\left(\frac{\partial E}{\partial\xi}\right)_0 = \frac{\xi_0}{m} = v_x \tag{15}$$

Similarly

$$\left(\frac{\partial E}{\partial\eta}\right)_0 = \frac{\eta_0}{m} = v_y \qquad \left(\frac{\partial E}{\partial\zeta}\right)_0 = \frac{\zeta_0}{m} = v_z$$

Thus $\mathbf{c}$ and $\mathbf{v}$ are identical.
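The identity of packet velocity and particle velocity can be confirmed by differentiating the energy numerically. This is a minimal sketch for the non-relativistic energy used above; the mass, the momentum components, and the finite-difference step are arbitrary illustrative values.

```python
def energy(xi: float, eta: float, zeta: float, m: float) -> float:
    """Non-relativistic kinetic energy E = (xi^2 + eta^2 + zeta^2) / 2m."""
    return (xi**2 + eta**2 + zeta**2) / (2.0 * m)


def packet_velocity(p0, m: float, step: float = 1e-6):
    """Components (dE/dxi, dE/deta, dE/dzeta) at p0 by central differences."""
    velocity = []
    for axis in range(3):
        plus, minus = list(p0), list(p0)
        plus[axis] += step
        minus[axis] -= step
        velocity.append((energy(*plus, m) - energy(*minus, m)) / (2.0 * step))
    return tuple(velocity)


M = 2.0                  # grams (assumed)
P0 = (3.0, 4.0, 12.0)    # momentum components in g*cm/sec (assumed)

c_packet = packet_velocity(P0, M)
v_particle = tuple(p / M for p in P0)
print("packet velocity  :", c_packet)
print("particle velocity:", v_particle)
```

The two printed triples agree to within the rounding error of the finite difference.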




This identity of motion of the particle and the packet for a time 
inspired the belief that the particle and the packet were identical. 
Such an interpretation is, of course, incorrect because the packet 
cannot be the particle. It must be considered solely as a description 
of the probability function from which can be drawn certain knowledge 
about the future behavior of the particle. We have already noted 
how the packet soon begins to spread outward, a fact that would make 
the identification of the particle and the packet impracticable. Even 
for a macroscopic particle the stability of the packet is a limited one, 
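for in the expansion of $E$ above we have retained only first-order
terms. If we had retained terms of higher order in the expansion we
would have had

$$\left(\frac{\partial E}{\partial\xi}\right)_0 \xi' + \left(\frac{\partial^2 E}{\partial\xi^2}\right)_0 \frac{\xi'^2}{2} + \cdots$$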


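Only as long as $t$ is not too great can the second- and higher-order
terms be considered negligible. But if $t$ becomes sufficiently great
these terms become of importance, and the packet ceases to maintain
its size and form. To show this let us subdivide the element $\Delta\xi\,\Delta\eta\,\Delta\zeta$
of the momentum space into elements $d\xi'\,d\eta'\,d\zeta'$ which are small
enough to consider $\xi'$, $\eta'$, $\zeta'$ constant within the element. The whole
packet then is made up of parts each of which contains the momentum
of an element $d\xi'\,d\eta'\,d\zeta'$. Thus $A$ will now be expressed by $\sum A_i$
where by $A_i$ we mean the integral

$$\iiint a(\xi'\eta'\zeta')\, e^{\frac{i}{\hbar}\left\{\left[\left(\frac{\partial E}{\partial\xi}\right)_0 t + \left(\frac{\partial^2 E}{\partial\xi^2}\right)_0 \frac{\xi'}{2} t + \cdots - x\right]\xi' + \cdots\right\}}\, d\xi'\, d\eta'\, d\zeta'$$

the integration being extended over the $i$th element $d\xi'\,d\eta'\,d\zeta'$. From
this it is seen that this particular sub-packet moves through space
according to the equation

$$\left[\left(\frac{\partial E}{\partial\xi}\right)_0 + \left(\frac{\partial^2 E}{\partial\xi^2}\right)_0 \frac{\xi'}{2} + \cdots\right] t - x = \text{constant}$$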


with corresponding forms for $y$ and $z$. Hence the different amplitudes
move with different velocities depending on $\xi'$, $\eta'$, $\zeta'$. Consequently
the sum of these amplitudes $\sum A_i$ as a function of $xyz$ represents
something that is changing in size and form, and as a result the
wave packet will dissolve. The smaller we take $\xi'\eta'\zeta'$, that is, the
more accurately the momentum of the particle is defined, the later will
this dissolution occur, for, according to the equations above, the
relative displacement of two amplitudes $A_i$ and $A_m$ will be
$(\partial^2 E/\partial\xi^2)_0 (\xi_i' - \xi_m')(t/2) + \cdots$. This will be smaller the smaller we take
$\xi'$. But, no matter how small we choose $\xi'$, if we let $t$ assume sufficiently
great values, this relative displacement increases and dissolution
sets in. Therefore the connection between the old and the new
mechanics can be expressed as follows: 


In the application of wave mechanics to macroscopic systems pre- 
dictions can be made which for all practical purposes do not differ 
from those of classical mechanics provided that a certain magnitude 
of time is considered, which magnitude will depend on the accuracy 
with which the momentum has been measured. 
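A rough order-of-magnitude estimate shows how long the agreement with classical mechanics lasts. The sketch below takes the relative displacement $(\partial^2 E/\partial\xi^2)_0\,\Delta\xi\,(t/2)$ between the extreme sub-packets, sets $\Delta\xi = h/\Delta x$, and asks when this displacement equals the initial width $\Delta x$. That criterion, the function name, and the sample masses and widths are assumptions made for illustration only.

```python
H = 6.62e-27  # Planck's constant, erg*sec


def spreading_time(mass_g: float, dx_cm: float) -> float:
    """Time (sec) after which the extreme sub-packets have separated by dx_cm,
    assuming E = xi^2 / 2m and the minimum momentum spread dxi = h / dx."""
    dxi = H / dx_cm              # momentum interval from dx * dxi = h
    d2e_dxi2 = 1.0 / mass_g      # second derivative of E = xi^2 / 2m
    return 2.0 * dx_cm / (d2e_dxi2 * dxi)


t_macro = spreading_time(1.0, 1e-2)         # 1 g body located within 0.1 mm
t_electron = spreading_time(9.11e-28, 1e-8) # electron located within an atom

print(f"1 g body: {t_macro:.2e} sec")
print(f"electron: {t_electron:.2e} sec")
```

Under these assumptions the macroscopic packet holds together for a time vastly longer than any observation, while the electron packet dissolves almost at once.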


In order to describe the connection between the propagation of the 
wave and the motion of the particle, the authors of wave mechanics 
frequently have used the concept of group velocity, which deals with 
the displacement of a point in space at which all the waves of a given 
group are in phase. Thus the group velocity is identical with the 
velocity with which a wave packet moves through space. Therefore 
the components of the group velocity are $(\partial E/\partial\xi)_0$, $(\partial E/\partial\eta)_0$, and
$(\partial E/\partial\zeta)_0$. Replacing $E$ by $h\nu$ and $p$ by $h/\lambda$,

$$\frac{\partial E}{\partial\xi} = \frac{dE}{dp}\frac{\partial p}{\partial\xi} = \frac{d\nu}{d(1/\lambda)}\frac{\partial p}{\partial\xi} \tag{16}$$

Now $p^2 = \xi^2 + \eta^2 + \zeta^2$; therefore $\partial p/\partial\xi = \xi/p = \alpha$, the direction
cosine of $\mathbf{p}$ relative to the $x$ axis. Therefore

$$\frac{\partial E}{\partial\xi} = \alpha\frac{d\nu}{d(1/\lambda)} \qquad \frac{\partial E}{\partial\eta} = \beta\frac{d\nu}{d(1/\lambda)} \qquad \frac{\partial E}{\partial\zeta} = \gamma\frac{d\nu}{d(1/\lambda)}$$

from which it follows that the group velocity is $d\nu/d(1/\lambda)$. Generally
this velocity differs from $V$, the velocity with which the waves are
propagated and which is called the phase velocity. The two velocities
will be identical only if $V$ is independent of $\nu$. In that case we obtain
$d(1/\lambda) = d(\nu/V) = d\nu/V$, and thus $d\nu/d(1/\lambda) = V$.

Finally, it is well to make clear the following point. In Section 4
we saw that wave mechanics does not subscribe to a strict causality
of events, a viewpoint that should not be misconstrued as complete
indeterminism. Now, however, we can interpret the point of view
more precisely. The answer to the question about the future of a
mechanical system can be a definite or an indefinite one, depending on
what we wish to know. The answer is an indefinite one if knowledge
of the exact state of a system at a future time $t$ is desired. As we have
seen this cannot be predicted precisely because the initial state is
not precisely known. On the other hand, if we seek only the probability
of finding the system in a given configuration at that time, a
definite answer can be given because this probability is determined 
uniquely by the propagation of a wave packet. Thus we assert that 
mechanical events can take a determinate or an indeterminate course 
depending on whether the concepts of corpuscular or wave mechanics are 
used for their description. 
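The distinction drawn in this section between the group velocity $d\nu/d(1/\lambda)$ and the phase velocity $V$ can be illustrated for the relativistic de Broglie wave of Section 5. The sketch below assumes an electron with $\beta = 0.6$ and uses $E^2 = (mc^2)^2 + (pc)^2$, which follows from the expressions for $E$ and $p$ given there; the constants, names, and step size are illustrative.

```python
import math

C = 2.998e10   # velocity of light, cm/sec
H = 6.62e-27   # Planck's constant, erg*sec
M = 9.11e-28   # electron rest mass in grams (assumed example particle)


def frequency(k: float) -> float:
    """nu as a function of the wave number k = 1/lambda, with E = h*nu and p = h*k."""
    return math.sqrt((M * C**2) ** 2 + (H * k * C) ** 2) / H


def group_velocity(k: float, rel_step: float = 1e-4) -> float:
    """d(nu)/d(1/lambda) by a central difference."""
    dk = rel_step * k
    return (frequency(k + dk) - frequency(k - dk)) / (2.0 * dk)


def phase_velocity(k: float) -> float:
    """V = lambda * nu = nu / k."""
    return frequency(k) / k


v = 0.6 * C
p = M * v / math.sqrt(1.0 - (v / C) ** 2)
k = p / H

print(f"particle velocity: {v:.4e} cm/sec")
print(f"group velocity   : {group_velocity(k):.4e} cm/sec")
print(f"phase velocity   : {phase_velocity(k):.4e} cm/sec")
```

The group velocity reproduces the particle velocity, whereas the phase velocity equals $c^2/v$ and exceeds the velocity of light.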

8. The Wave Mechanics of a Particle Moving in a Field of 
Force. Up to this point we have considered a particle subject to no 
force. We now consider a particle moving in a field of force in which a 
potential function V(xyz) exists. To apply the method of wave 
mechanics to such a particle, it seems that there must be some relation 
between the wave motion and this potential function, otherwise there 
could be no connection between the waves and the motion of the 
particle. Consideration of the case brings up the idea of waves in a 
heterogeneous medium, this heterogeneity being defined by the func- 
tion V(xyz). However, it was not easy to develop the idea. As we 
know now, the credit for overcoming this difficulty belongs to Schroed- 
inger. de Broglie had pointed out that there was a significant connec- 
tion between the principles of Maupertuis and Fermat. Following 
this suggestion, Schroedinger undertook the further development of 
the mechanical-optical relations which had been known to Hamilton 
and might now be used for the purposes of wave mechanics. 

To make Schroedinger’s idea clear, let us confine ourselves to 
classical mechanics first. Consider a function S representing the 
action of a particle that moves in a field of force. $S$ is defined by the
integral $\int (p_x\,dx + p_y\,dy + p_z\,dz)$, which is taken over the path between
the space points $P_0(x_0y_0z_0)$ and $P(xyz)$. According to classical mechanics
the motion starting at $P_0$ is determined uniquely by $x_0y_0z_0$ and
$\dot{x}_0\dot{y}_0\dot{z}_0$. Thus $S$ can be considered a function of these quantities
and time or, for brevity, of $q_0$, $\dot{q}_0$, and $t$. The coordinates of $P$ are
also functions of $q_0$, $\dot{q}_0$, and $t$. Furthermore the energy $E$ of the particle
can be calculated from $q_0$ and $\dot{q}_0$, this energy remaining constant
throughout the motion. Therefore $S$ can be given as a function of
$x_0y_0z_0xyzE$. Thus, when a particle moves with an energy $E$ by suitable
projection from $P_0$ to $P$, $S$ gives the action $\int (p_x\,dx + p_y\,dy + p_z\,dz)$
as a function of $P_0$, $P$, and $E$.


From the way in which S has been defined, we see that the vector 
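momentum $\mathbf{p}$ is the gradient of $S$, for the components of $\mathbf{p}$ are

$$p_x = \frac{\partial S}{\partial x} \qquad p_y = \frac{\partial S}{\partial y} \qquad p_z = \frac{\partial S}{\partial z} \tag{17}$$

Now the energy of the particle is

$$\frac{1}{2m}(p_x^2 + p_y^2 + p_z^2) + V(xyz) = E$$

Introducing the components of the gradient, we obtain

$$\frac{1}{2m}\left[\left(\frac{\partial S}{\partial x}\right)^2 + \left(\frac{\partial S}{\partial y}\right)^2 + \left(\frac{\partial S}{\partial z}\right)^2\right] + V(xyz) = E \tag{18}$$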


Thus S satisfies the fundamental differential equation of Hamilton. 
From this equation can be derived the whole of the possible motions of 
a particle. Now assume that we are able to solve the equation by 
means of a function $S(xyz)$ which contains as parameters (aside from
an unimportant additive constant), besides $E$, two constants $\alpha$ and
$\beta$, so that $S = S(xyzE\alpha\beta)$. The solution then associates with every
point in space a certain vector $\mathbf{p} = \operatorname{grad} S$. The composition of these
vectors results in a twice-infinite manifold of curves each one of which
represents a possible path of the particle. If we are interested in a
particular motion with the initial state $q_0\dot{q}_0$, then the constants
$E$, $\alpha$, $\beta$ of the solution have to be chosen so that for $x = x_0$, $y = y_0$,
$z = z_0$, $\operatorname{grad} S = \mathbf{p}_0$. The path is then given by that member of the
manifold which passes through $x_0y_0z_0$. According to Jacobi every
motion of the particle can be described by

$$\frac{\partial S}{\partial\alpha} = \alpha_1 \qquad \frac{\partial S}{\partial\beta} = \beta_1 \qquad \frac{\partial S}{\partial E} = t - t_0 \tag{19}$$

where $\alpha_1$, $\beta_1$, and $t_0$ are three new constants. Since two equations in
$xyz$ determine a curve, the first two equations of (19) select a certain
curve from the manifold. The third equation fixes the position of the
particle at time $t$.
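As an illustration of how (18) and (19) work together, consider the simplest case of a free particle, $V = 0$. The following worked example is a sketch under that assumption; the particular complete integral chosen, with $\alpha$ and $\beta$ playing the role of the constant momentum components along $x$ and $y$, is one convenient choice and is not taken from the text.

```latex
% Free particle, V = 0: a complete integral of equation (18)
\[
S(xyzE\alpha\beta) = \alpha x + \beta y + \sqrt{2mE - \alpha^2 - \beta^2}\; z
\]
% Check: (1/2m)[(dS/dx)^2 + (dS/dy)^2 + (dS/dz)^2]
%        = (1/2m)[alpha^2 + beta^2 + (2mE - alpha^2 - beta^2)] = E.

% Jacobi's equations (19), writing p_z = sqrt(2mE - alpha^2 - beta^2):
\begin{align*}
\frac{\partial S}{\partial \alpha} &= x - \frac{\alpha}{p_z}\, z = \alpha_1 \\
\frac{\partial S}{\partial \beta}  &= y - \frac{\beta}{p_z}\, z = \beta_1 \\
\frac{\partial S}{\partial E}      &= \frac{m}{p_z}\, z = t - t_0
\end{align*}
% The first two equations describe a straight line in space;
% the third gives uniform motion along it, z = (p_z / m)(t - t_0).
```

The result is the expected rectilinear, uniform motion, with $\alpha_1$ and $\beta_1$ fixing where the straight path crosses the plane $z = 0$.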
Let us compare an optical problem with the mechanical one by 
investigating the propagation of light in a heterogeneous medium. 


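This propagation is defined by the wave equation

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2} = \frac{1}{v^2}\frac{\partial^2 u}{\partial t^2} \tag{20}$$

where $v$ is the velocity of light and is a function of $xyz$. Assume a
solution of the form

$$u = a(xyz) \sin 2\pi[\nu t - k\phi(xyz)] \tag{21}$$

where $a(xyz)$ is the unknown amplitude and $u$ represents the excitation
produced by the light. The significance of the function $\phi$ is that
every surface represented by $\phi = \text{constant}$ is a surface of
constant phase. With the above value of $u$, equation (20) becomes

$$\left[\sum\frac{\partial^2 a}{\partial x^2} - 4\pi^2k^2a\sum\left(\frac{\partial\phi}{\partial x}\right)^2 + \frac{4\pi^2\nu^2}{v^2}a\right]\sin 2\pi(\nu t - k\phi) - 2\pi k\left[2\sum\frac{\partial a}{\partial x}\frac{\partial\phi}{\partial x} + a\sum\frac{\partial^2\phi}{\partial x^2}\right]\cos 2\pi(\nu t - k\phi) = 0$$

where each sum extends over the three coordinates $x$, $y$, $z$.
This equation is satisfied only if the factors of the sine and cosine vanish.
From this result the equations

$$\sum\left(\frac{\partial\phi}{\partial x}\right)^2 = \frac{\nu^2}{k^2v^2} + \frac{1}{4\pi^2k^2a}\sum\frac{\partial^2 a}{\partial x^2} \tag{22}$$

$$2\sum\frac{\partial a}{\partial x}\frac{\partial\phi}{\partial x} + a\sum\frac{\partial^2\phi}{\partial x^2} = 0$$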
Now assume that the wavelength $\lambda$ is sufficiently small so that $a(xyz)$
may be considered nearly constant within a distance comparable in
magnitude to $\lambda$. A sufficiently short wavelength undergoes no
noticeable diffraction, and under this condition the postulates of
geometric optics apply. Therefore equation (22) must permit a
corresponding simplification.
To determine the wavelength at a given point we must find the
distance measured along a normal $l$ to the surface $\phi$ = constant, which
will introduce a change of $2\pi$ into the argument of the sine. Thus
$k\lambda(\partial\phi/\partial l) = 1$, or, on defining the direction of the normal $l$ by $\alpha$, $\beta$, $\gamma$,


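$$k\lambda \left(\alpha \frac{\partial \phi}{\partial x} + \beta \frac{\partial \phi}{\partial y} + \gamma \frac{\partial \phi}{\partial z}\right) = 1$$

Now

$$\alpha : \beta : \gamma = \frac{\partial \phi}{\partial x} : \frac{\partial \phi}{\partial y} : \frac{\partial \phi}{\partial z}$$

so that

$$\frac{1}{\lambda^2} = k^2 \sum \left(\frac{\partial \phi}{\partial x}\right)^2$$

On the assumption that $(\partial a/\partial x)\lambda \ll a$, on differentiation we obtain

$$\frac{\partial^2 a}{\partial x^2}\,\lambda + \frac{\partial a}{\partial x}\frac{\partial \lambda}{\partial x} \ll \frac{\partial a}{\partial x}$$

or

$$\frac{\partial^2 a}{\partial x^2}\,\lambda \ll \frac{\partial a}{\partial x}\left(1 - \frac{\partial \lambda}{\partial x}\right) \approx \frac{\partial a}{\partial x}$$

Multiply both sides by $\lambda$,

$$\frac{\partial^2 a}{\partial x^2}\,\lambda^2 \ll \frac{\partial a}{\partial x}\,\lambda \ll a$$

Finally we can write

$$\frac{1}{a}\frac{\partial^2 a}{\partial x^2} \ll \frac{1}{\lambda^2} = k^2 \sum \left(\frac{\partial \phi}{\partial x}\right)^2$$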
Thus we observe that, when $\lambda$ is sufficiently small, the last term in
equation (22) is negligible and the equation takes the simpler form

$$\sum \left(\frac{\partial \phi}{\partial x}\right)^2 = \frac{\nu^2}{k^2 v^2} \tag{23}$$

which agrees in form with equation (18). It appears then that the
phase function $\phi$ in the domain of geometric optics satisfies Hamilton's
differential equation and becomes identical with it if we set
$\nu^2/k^2 v^2 = 2m(E - V)$. Thus the same differential equation

$$\sum \left(\frac{\partial \phi}{\partial x}\right)^2 = 2m(E - V) \tag{24}$$

can account for the motion of a particle of energy $E$ in a field of force
$V(xyz)$ according to classical mechanics as well as for the propagation
of light in a heterogeneous medium described optically by the relation

$$v(xyz) = \frac{\nu}{k\sqrt{2m(E - V)}} \tag{25}$$


Equation (24) has, therefore, a mechanical as well as an optical
interpretation. For the mechanical case a complete integral, $\phi(xyzE\alpha\beta)$,
has the significance of the action function from which a twice-infinite
manifold of paths can be derived with grad $\phi$ giving for all points in
space the direction of the momentum with which the particle passes
the point. In the optical case the same solution determines the phase
surfaces of the wave motion, and the components of grad $\phi$ are
proportional to the direction cosines of the light beam at a given point $P$.
Thus, with every motion of the particle in the field $V(xyz)$, a beam of
light in a heterogeneous medium can be associated so that the beam
takes the same way as the particle.

In order that equation (25) may have a physical meaning, we must
assign to the constant $k$ the dimension of a reciprocal action because
$\nu/[k\sqrt{2m(E - V)}]$ must have the dimension of a velocity. Thus
arises the idea of a relationship between $k$ and $h$, and we shall see that
it is possible to develop a consistent theory on the assumption that
$k = 1/h$. Equation (25) then takes the form

$$v(xyz) = \frac{h\nu}{\sqrt{2m(E - V)}} \tag{26}$$
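As a numerical aside, equation (26) can be turned into a local wavelength $\lambda = v/\nu$, which reduces to $h/\sqrt{2m(E - V)}$. The following Python sketch does this; the function names and the CGS value of $h$ are assumptions chosen for illustration, and $E - V$ is treated simply as the kinetic energy.

```python
import math

H = 6.62e-27  # Planck's constant in erg*sec (assumed CGS value)

def phase_velocity(E, V, m, h=H):
    """Phase velocity v = h*nu / sqrt(2m(E - V)) with nu = E/h, as in eq. (26)."""
    if E <= V:
        raise ValueError("the geometric picture needs E > V")
    nu = E / h
    return h * nu / math.sqrt(2.0 * m * (E - V))

def local_wavelength(E, V, m, h=H):
    """Wavelength lambda = v / nu; algebraically this equals h / sqrt(2m(E - V))."""
    return phase_velocity(E, V, m, h) / (E / h)
```

The wavelength depends on position only through $V(xyz)$, which is why the condition for geometric optics is a condition on how fast the potential varies.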


And so it turns out that this close connection between the classical
motion of a particle and a beam of light in a heterogeneous medium
conforms with the aims of wave mechanics, although at first it is
purely formal in character, for that which we wish is the representation
of the state $\Delta x\,\Delta y\,\Delta z\,\Delta\xi\,\Delta\eta\,\Delta\zeta$ by a wave group the propagation of
which will supply us with information about the future behavior of
the particle. We can achieve this aim by associating with every point
$xyz\xi\eta\zeta$ within the element $\Delta x \cdots \Delta\zeta$ a certain solution of the wave
equation

$$\nabla^2 u + \frac{4\pi^2 \nu^2}{v^2}\, u = \nabla^2 u + \frac{8\pi^2 m}{h^2}(E - V)\, u = 0 \tag{27}$$

where $E$ is the energy corresponding to the point and $\nu = E/h$. If
the wave under consideration satisfies the assumptions of geometric
optics, there is no doubt about the procedure. We have seen that
equation (27) can have a solution of the form

$$a(xyz) \sin 2\pi[\nu t - k\phi(xyz)]$$

wherein the function $\phi$ satisfies Hamilton's equation (24). Accordingly
we shall identify $\phi$ with the action function $S(xyzE\alpha\beta)$ specified
by the point $xyz\xi\eta\zeta$. The amplitude $a$ can be calculated from
equation (22'), $\phi$ being replaced by $S$. This procedure is permitted only if
$\lambda$ is sufficiently small so that $a(xyz)$ is nearly constant within an
interval of the same order of magnitude as $\lambda$. If this condition is
satisfied, we can use a wave mechanics method which, although of
only a limited validity, proves very valuable in many applications.

9. The Geometrical Method of Wave Mechanics. The purpose 
of this section is to describe, with the help of a wave packet, the initial 
state of a particle according to the plan discussed in the preceding
section. To accomplish this let us consider first all the points $xyz\xi\eta\zeta$
lying within the element $\Delta x \cdots \Delta\zeta$ to which the same function
$S(xyzE\alpha\beta)$ applies. We mean by this that the equations

$$\frac{\partial S}{\partial x} = \xi \qquad \frac{\partial S}{\partial y} = \eta \qquad \frac{\partial S}{\partial z} = \zeta$$

give the same values of $\alpha$, $\beta$, and $E$ for all the points. Then a wave
can be associated with all these points; this wave will be given by the
real part of the expression

$$a(xyz)\, e^{(2\pi i/h)[Et - S(xyzE\alpha\beta)]} \tag{28}$$

The frequency of the wave is defined, as before, by $\nu = E/h$, where $E$
comprises the rest energy $m_0 c^2$ as well as the kinetic and potential
energy. The complex amplitude $a(xyz)$ is assumed to be a solution of
equation (22') if we substitute $S(xyzE\alpha\beta)$ for $\phi$. Thus the amplitude
will be dependent also on the variables $E$, $\alpha$, and $\beta$ and will be expressed
by $a(xyzE\alpha\beta)$. Because of the homogeneity of equation (22') in
$a(xyz)$, a complex factor remains arbitrary. We are then permitted to
multiply the amplitude of the wave by any real number and to add an
arbitrary phase constant to the argument of the cosine, a possibility
that is important in the construction of the wave packet.

The wave packet is now formed of waves represented by equation
(28) in such a way that, if the parameters $E$, $\alpha$, $\beta$ are varied within
suitable intervals $\Delta E$, $\Delta\alpha$, and $\Delta\beta$, all points in the element $\Delta x \cdots \Delta\zeta$
are considered. Hence is obtained a manifold of waves which are
represented by

$$u(xyzt) = \iiint a(xyzE\alpha\beta)\, e^{(2\pi i/h)[Et - S(xyzE\alpha\beta)]}\, dE\, d\alpha\, d\beta \tag{29}$$

the integration being extended over the intervals $\Delta E$, $\Delta\alpha$, $\Delta\beta$.
However, the above equation is still deficient because with each $a(xyz)$
there is an arbitrary factor $\gamma e^{i\delta}$. To remedy this we must try to select
the factors $\gamma e^{i\delta}$ so as to superpose the waves with such amplitudes
and in such phases as to have all of them related to the space element
$\Delta x\,\Delta y\,\Delta z$. This can be done by making use of the uncertainty
relations (1), which make it possible to superpose the waves so that at time $t$
the resultant intensity differs from zero only within the limits $\Delta x$, $\Delta y$,
$\Delta z$. As noted above, this can be accomplished by a suitable choice of
amplitudes and phases. Again it should be pointed out that the
procedure is to leave the constructed wave packet undisturbed. Then the
intensity at a point $P$ at time $t$ can be taken as a measure of the
probability of locating the particle at that time and position.




It can be proved easily that, when this method is applied to a particle
upon which no force is acting, the result agrees with that of Section 6.
For in this case the action function $S$ of such a particle is given by

$$S = \int (\xi\, dx + \eta\, dy + \zeta\, dz) = \xi x + \eta y + \zeta z$$

In this expression, $\xi$, $\eta$, and $\zeta$ are to be taken as parameters, any two
of which, say $\xi$ and $\eta$, may be identified with the constants $\alpha$ and $\beta$.
Then the relations between $E$, $\alpha$, $\beta$ and $\xi$, $\eta$, $\zeta$ can be expressed by

$$\alpha = \xi \qquad \beta = \eta \qquad E = \frac{\xi^2 + \eta^2 + \zeta^2}{2m} + \text{constant}$$

For $a(xyz)$ in equation (29), with $\phi = S$, the solution $a$ = constant
follows, it being possible for the constant to depend on $\xi$, $\eta$, $\zeta$.
Equation (29) can be transformed by substituting, in the integral, $\xi$, $\eta$, and $\zeta$
for the variables $E$, $\alpha$, and $\beta$. This gives

$$u(xyzt) = \iiint a(\xi\eta\zeta)\, e^{(2\pi i/h)[Et - (\xi x + \eta y + \zeta z)]}\, d\xi\, d\eta\, d\zeta \tag{30}$$

which is equation (12).

It should be clear at once that the different velocities with which the 
members of the ensemble represented by (29) travel through space 
must, before long, dissolve the initially sharply defined packet. The 
points at which the waves are in phase agreement will begin to separate 
and diverge outward in space. Thus the uncertainty of the initial 
state gives rise to an increasing uncertainty in future states. Only 
in the case of a particle of sufficiently large mass can wave mechanics 
agree with classical mechanics and supply information that is, for all 
practical purposes, certain. Here the probability function described 
by the wave packet is confined to a very small space at any time, so 
that the prediction assumes the character of certainty and the pre- 
dicted position agrees with that which results from the initial state 
according to classical calculation. In other words, for large mass, 
the requirement is that the wave group remain intact and carry out a 
motion corresponding to the laws of classical mechanics. 

To show this let us consider a particle the momentum of which has
been measured with great accuracy. For a macroscopic body this
measurement does not affect the accurate measurement of position.
The initial state is represented by a group of waves of the form
$a\, e^{(2\pi i/h)(Et - S)}$ in which the constants $E$, $\alpha$, and $\beta$ vary so slightly that
their values are confined to the immediate neighborhood of certain
values, $E_0$, $\alpha_0$, and $\beta_0$. This limitation arises in the case under
discussion because $E$, $\alpha$, and $\beta$ at a point depend on the components of
momentum, hence the variation will be small. Thus if $E$, $\alpha$, and $\beta$
become $E_0 + E'$, $\alpha_0 + \alpha'$, $\beta_0 + \beta'$, then $E'$, $\alpha'$, and $\beta'$ lie between zero
and $\Delta E$, $\Delta\alpha$, and $\Delta\beta$. Therefore we have for $S$


$$S(xyz,\, E_0 + E',\, \alpha_0 + \alpha',\, \beta_0 + \beta') = S(xyzE_0\alpha_0\beta_0) + \left(\frac{\partial S}{\partial E}\right)_0 E' + \left(\frac{\partial S}{\partial \alpha}\right)_0 \alpha' + \left(\frac{\partial S}{\partial \beta}\right)_0 \beta' + \cdots$$

where the subscript 0 of the derivatives means that the operation is to
be performed at the point $E_0\alpha_0\beta_0$. On the other hand, in the case of
the amplitude $a$, which depends on $E$, $\alpha$, $\beta$, these constants can be
considered the constants $E_0$, $\alpha_0$, $\beta_0$, that is $a = a(xyzE_0\alpha_0\beta_0)$. The
same thing cannot be permitted in the case of $S$ because it appears in
the wave expression together with the factor $1/h$. With these changes
the equation for $u$ becomes


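$$u(xyzt) = \iiint a(xyzE_0\alpha_0\beta_0) \exp\left\{\frac{2\pi i}{h}\left[(E_0 + E')t - S_0 - \left(\frac{\partial S}{\partial E}\right)_0 E' - \left(\frac{\partial S}{\partial \alpha}\right)_0 \alpha' - \left(\frac{\partial S}{\partial \beta}\right)_0 \beta'\right]\right\} dE'\, d\alpha'\, d\beta'$$

$$= a(xyzE_0\alpha_0\beta_0)\, e^{(2\pi i/h)(E_0 t - S_0)} \iiint \exp\left\{\frac{2\pi i}{h}\left[\left(t - \left(\frac{\partial S}{\partial E}\right)_0\right) E' - \left(\frac{\partial S}{\partial \alpha}\right)_0 \alpha' - \left(\frac{\partial S}{\partial \beta}\right)_0 \beta'\right]\right\} dE'\, d\alpha'\, d\beta'$$

where $S_0 = S(xyzE_0\alpha_0\beta_0)$. Finally, setting the triple integral equal
to $A$,

$$A = \iiint \exp\left\{\frac{2\pi i}{h}\left[\left(t - \left(\frac{\partial S}{\partial E}\right)_0\right) E' - \left(\frac{\partial S}{\partial \alpha}\right)_0 \alpha' - \left(\frac{\partial S}{\partial \beta}\right)_0 \beta'\right]\right\} dE'\, d\alpha'\, d\beta' \tag{31}$$

we have

$$u(xyzt) = aA\, e^{(2\pi i/h)(E_0 t - S_0)} \tag{32}$$

from which we see that $|aA|$ represents the resultant amplitude. Thus
the probability is given by $|a^2 A^2|$ which, for any time $t$, furnishes the
probability of finding the particle at $P$.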

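Let us consider first the factor $A$ of the probability function. From
the way in which the wave packet has been formed, it follows that, at
time $t$, $A$ differs from zero within the element $\Delta x\,\Delta y\,\Delta z$, vanishing
elsewhere. Now, according to equation (31), $A$ depends on $x$, $y$, $z$, and
$t$ only in the terms $t - (\partial S/\partial E)_0$, $(\partial S/\partial\alpha)_0$, and $(\partial S/\partial\beta)_0$. Hence, if
$x$, $y$, and $z$ change with time in such a way that

$$t - \left(\frac{\partial S}{\partial E}\right)_0 = \text{constant} \qquad \left(\frac{\partial S}{\partial \alpha}\right)_0 = \text{constant} \qquad \left(\frac{\partial S}{\partial \beta}\right)_0 = \text{constant} \tag{33}$$

the amplitude must remain the same. And so the amplitude, at any
time, differs from zero only at those points which satisfy relation (33).
In other words, if at some point $P_0(x_0 y_0 z_0)$ of the space $\Delta x\,\Delta y\,\Delta z$ the
functions $t - (\partial S/\partial E)_0$, $(\partial S/\partial\alpha)_0$, and $(\partial S/\partial\beta)_0$ equal $a$, $b$, and $c$
respectively at $t_0$, then in an interval of time $t - t_0$ the amplitude
shifts to a point $P$, determined by

$$t - \left(\frac{\partial S}{\partial E}\right)_0 = a \qquad \left(\frac{\partial S}{\partial \alpha}\right)_0 = b \qquad \left(\frac{\partial S}{\partial \beta}\right)_0 = c$$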


On comparing these equations with those of (19), we find them to be 
identical with Jacobi’s solution of the mechanical problem. They 
describe the motion carried out by a point mass according to classical 
mechanics, the motion starting at point $P_0$ and the initial momentum
being given by grad $S(xyzE_0\alpha_0\beta_0)$ taken at $P_0$.
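The statement that $A$ depends on position and time only through the three combinations in (33) can be illustrated numerically. In the sketch below the triple integral (31) is written as a product of three one-dimensional integrals, each taken from zero to the width of its interval; this factorization, the function names, and the omission of the arbitrary phase factors are assumptions made for illustration.

```python
import cmath
import math

H = 6.62e-27  # Planck's constant in erg*sec (assumed CGS value)

def band_integral(c, width, h=H):
    """Closed form of the integral of exp((2*pi*i/h) * c * s) for s from 0 to width."""
    kappa = 2.0 * math.pi * c / h
    if kappa == 0.0:
        return complex(width, 0.0)
    return (cmath.exp(1j * kappa * width) - 1.0) / (1j * kappa)

def amplitude_factor(tau, dS_dalpha, dS_dbeta, dE, dalpha, dbeta, h=H):
    """A of eq. (31) as a product of three band integrals.

    tau stands for t - (dS/dE)_0; dS_dalpha and dS_dbeta stand for
    (dS/d alpha)_0 and (dS/d beta)_0 at the point considered.
    """
    return (band_integral(tau, dE, h)
            * band_integral(-dS_dalpha, dalpha, h)
            * band_integral(-dS_dbeta, dbeta, h))
```

With these conventions $|A|$ is largest where all three arguments vanish and has its first zero in `tau` at `h / dE`, so a point that keeps the three arguments fixed carries the same amplitude along with it.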

In this case there is no difficulty in following the motion of the wave
packet. Let us imagine that all space is filled with a fluid the
molecules of which are moving with a momentum $\mathbf{p} = \operatorname{grad} S(xyzE_0\alpha_0\beta_0)$
at every point $xyz$. The behavior of the wave packet is described by
the motion of that part of the fluid which at the instant $t$ lies within
the space $\Delta x\,\Delta y\,\Delta z$. Consequently, since the corresponding fluid
moves within a certain stream tube, the wave packet holds together.
Even though there are small changes in volume due to variations in
cross section of tube, the volume remains extremely small in a
macroscopic sense. And thus, to a particle moving in space according to
the laws of classical mechanics, the wave packet will assign at each
instant of time a definite position, the term "definite" being used
in the macroscopic sense. From this we see that when heavy particles
are involved the old and new mechanics are in full agreement.

It should be pointed out that the limitation mentioned in Section 7 
applies here as well. Hence, if time assumes a sufficiently great value, 
the wave packet will begin to lose its stability. For, if the terms of 
higher order are retained in the development of S, the wave packet 
defined by equation (29) disperses into parts moving with different 
velocities, causing the region of phase agreement to become more and 
more indefinite. We see then that classical mechanics is true only to 
a certain approximation which is closer the more accurately the 
momentum of the particle has been measured. 
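To give a feeling for the time scales involved, the sketch below uses the standard spreading law of a free Gaussian packet, $\sigma(t) = \sigma_0\sqrt{1 + (\hbar t/2m\sigma_0^2)^2}$ with $\hbar = h/2\pi$. This formula is not derived in the present section and is brought in as an assumption, as are the function names and the numerical constants.

```python
import math

HBAR = 1.054e-27  # h / (2*pi) in erg*sec (assumed CGS value)

def packet_width(t, sigma0, m, hbar=HBAR):
    """Width of a free Gaussian packet after time t (standard result, assumed here)."""
    return sigma0 * math.sqrt(1.0 + (hbar * t / (2.0 * m * sigma0 ** 2)) ** 2)

def doubling_time(sigma0, m, hbar=HBAR):
    """Time after which the initial width sigma0 has doubled."""
    return math.sqrt(3.0) * 2.0 * m * sigma0 ** 2 / hbar

# An electron localized to 1e-8 cm versus a 1 g body localized to 1e-4 cm.
t_electron = doubling_time(1e-8, 9.11e-28)
t_gram = doubling_time(1e-4, 1.0)
```

Under these assumptions the electron packet doubles its width in a time far below $10^{-15}$ sec, whereas the packet of the gram mass needs more than $10^{19}$ sec, which is the quantitative content of the statement that classical mechanics holds for heavy particles.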

It must be emphasized again that all the waves in the packet defined
by equation (29) are supposed to have a sufficiently short wavelength.
Only on this assumption can we apply the laws of geometric optics to
the propagation of waves. As we have seen, the wavelength is
inversely proportional to the momentum of the particle, and therefore
the geometric method is adequate for the treatment of a rapidly
moving particle provided that its momentum satisfies the condition that
$v(xyz) = h\nu/\sqrt{2m(E - V)}$ does not change noticeably within the
distance $\lambda = h/p$.

What must the procedure be if the above supposition is not fulfilled?
In the theory of the atom this is precisely the situation in which we are
interested. Under these circumstances we cannot neglect the last
term in equation (22), and thus there is removed from consideration
any connection with geometric optics and therefore with classical
mechanics which, up to this point, we could have maintained. The
only alternative is to consider the waves that are to be associated with
the particle by the wave equation

$$\nabla^2 u + \frac{4\pi^2 \nu^2}{v^2}\, u = 0$$

in which $h\nu/\sqrt{2m(E - V)}$ is to replace $v(xyz)$. In other words, we
must return to Schroedinger's equation

$$\nabla^2 u + \frac{8\pi^2 m}{h^2}(E - V)\, u = 0$$

and attempt to find a rigorous solution of that equation. This will be
considered in detail in Chapter 2.

10. The Scattering of Probability Waves by a Nucleus. The
geometric method may be used in dealing with the deflection of $\alpha$
particles by a nucleus if we limit ourselves to those deflections wherein
the particle does not come too close to the nucleus. Consider an
alpha particle moving in the direction of the $x$ axis, a nucleus of charge
$Ze$ being at the origin. In the experiments of Rutherford the velocity
of the particle was $2 \times 10^9$ cm/sec. This corresponds to a de Broglie
wavelength, $\lambda = h/p = (6.62 \times 10^{-27})/(2 \times 10^9 \times 6.6 \times 10^{-24})$, which
is about $5 \times 10^{-13}$ cm, comparable to the linear dimension of the nucleus.
Thus, if we disregard the immediate neighborhood in which the strongest
deflections occur, the Coulomb field of the nucleus may be considered
nearly constant within a range of the order $\lambda$. Hence we can treat at
least the smaller deflections by the method of the preceding section.
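The wavelength quoted above is a one-line computation. A minimal Python sketch, using the CGS values that appear in the text (the variable and function names are illustrative assumptions):

```python
H = 6.62e-27        # Planck's constant, erg*sec (value used in the text)
M_ALPHA = 6.6e-24   # mass of the alpha particle, g (value used in the text)
V_ALPHA = 2e9       # velocity of the alpha particle, cm/sec

def de_broglie_wavelength(mass_g, speed_cm_s, h=H):
    """Return lambda = h / p in cm for a nonrelativistic particle."""
    return h / (mass_g * speed_cm_s)

lam_alpha = de_broglie_wavelength(M_ALPHA, V_ALPHA)
print(f"alpha-particle wavelength: {lam_alpha:.2e} cm")
```

The result is close to $5 \times 10^{-13}$ cm, in agreement with the estimate in the text.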

Let us assume that the velocity with which the particle approaches
the nucleus was measured with great accuracy. Then according to
Heisenberg's relations (1) our knowledge of the initial position is
very inaccurate. All we know is that at time $t_0$ the particle was
somewhere in a large space $\Delta x\,\Delta y\,\Delta z$ around the $x$ axis. Accordingly the
initial state must be represented by a probability packet filling this
space uniformly. The problem then is to determine the history of this
packet during the interval $t - t_0$.

For this purpose we make use of the theorem proved in the preceding
section. That theorem states that, if the initial momentum of the
particle is known very accurately, the amplitude factor $A$ of a point
$P_0$ will be at the same point $P$ after the interval $t - t_0$ as a
particle which starts from $P_0$ at time $t_0$, with the given initial velocity
and subject to classical mechanics, would be at time $t$. To determine
the classical path of such a particle we proceed as follows. Let the
initial velocity be $v$, and $b$ the initial distance of the particle from the $x$
axis. Then the angular momentum relative to $O$ at this instant is
$mbv$. When the particle passes a point $M$ in its path, its angular
momentum is given by $-mr^2(d\theta/dt)$, where $\theta$ is the angle between $OM$
and the $x$ axis. Since angular momentum is conserved, we have
$-r^2(d\theta/dt) = bv$. Furthermore, if $v_y$ is the $y$ component of the velocity
at $M$, then, from Coulomb's law,

$$m\frac{dv_y}{dt} = \frac{2Ze^2}{r^2}\sin\theta = -\frac{2Ze^2}{bv}\sin\theta\,\frac{d\theta}{dt}$$

which on integrating gives (the initial conditions $v_y = 0$, $\theta = \pi$ being
kept in mind)

$$mv_y = \frac{2Ze^2}{bv}(\cos\theta + 1) \tag{34}$$

When the particle has passed the domain within which the nucleus
exerts a noticeable repulsion, it resumes its original velocity $v$ along a
straight line. If $\alpha$ is the angle between this straight line and the $x$
axis ($\alpha$ depends on $b$), we have $v_y = v \sin\alpha$ and $\theta = \alpha$. Equation
(34) then becomes

$$mv\sin\alpha = \frac{2Ze^2}{bv}(\cos\alpha + 1)$$

Therefore

$$b = \frac{2Ze^2}{mv^2}\,\frac{\cos\alpha + 1}{\sin\alpha} = \frac{2Ze^2}{mv^2}\cot\frac{\alpha}{2} \tag{35}$$

Equation (35) determines the angle $\alpha$ through which a particle,
incident at a distance $b$ from the axis, will be deflected by the nucleus.
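Equation (35) can be evaluated in either direction. The following Python sketch computes $b$ from $\alpha$ and inverts the relation to give $\alpha$ from $b$; the function names and the CGS value of the elementary charge are assumptions introduced for illustration.

```python
import math

E_CHARGE = 4.80e-10  # elementary charge in esu (assumed CGS value)

def impact_parameter(alpha, Z, m, v, e=E_CHARGE):
    """b = (2 Z e^2 / (m v^2)) * cot(alpha / 2), eq. (35); alpha in radians."""
    return 2.0 * Z * e ** 2 / (m * v ** 2) / math.tan(alpha / 2.0)

def deflection_angle(b, Z, m, v, e=E_CHARGE):
    """Inverse of eq. (35): alpha = 2 * arctan(2 Z e^2 / (m v^2 b))."""
    return 2.0 * math.atan2(2.0 * Z * e ** 2, m * v ** 2 * b)

# Illustrative case: alpha particle (m = 6.6e-24 g, v = 2e9 cm/sec) on Z = 79.
b_right_angle = impact_parameter(math.pi / 2.0, 79, 6.6e-24, 2e9)
```

In this sketch a deflection of 90 degrees corresponds to $b = 2Ze^2/mv^2$, and the deflection tends to zero as $b$ grows, as (35) requires.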
Let us determine now the probability of finding the particle at a
point $P$ from measurements made at time $t$. This probability is given
by the intensity $C^2 = |aA|^2$ of the wave packet at that time and
position. This intensity is compared with the intensity $C_0^2 = 1$,
the value at the point $P_0$ at time $t_0$. Since the factor $A$ is transferred
without change from $P_0$ to $P$, then $C^2$ and $C_0^2$ differ only on account of
the factor $|a(xyz)|^2$. If the value of this factor at $P_0$ and $t_0$ is $|a_0|^2$,
then $C_0^2 = |a_0|^2 A^2$ and therefore $C^2 = |a|^2 A^2$ becomes $|a^2/a_0^2|$. To
determine the ratio $|a^2/a_0^2|$ we make use of equation (22'), according to
which

$$2\sum \frac{\partial \phi}{\partial x}\frac{\partial a}{\partial x} + a \sum \frac{\partial^2 \phi}{\partial x^2} = 0$$

In this equation the action function $S$ must be substituted for $\phi$.
Therefore grad $\phi$ is proportional to the classical velocity $\mathbf{v}$ with which
the particle is moving. Thus we get

$$2\sum v_x \frac{\partial a}{\partial x} + a \operatorname{div} \mathbf{v} = 0$$

On multiplying the above equation by $a$ we obtain

$$\sum v_x \frac{\partial a^2}{\partial x} + a^2 \operatorname{div} \mathbf{v} = \operatorname{div} a^2\mathbf{v} = 0 \tag{36}$$
Hence the vector $a^2\mathbf{v}$ is solenoidal, and this enables us to determine
$|a^2/a_0^2|$ in the following way. At point $P_0$ an annular ring of radii $b$
and $b + db$ is described around the $x$ axis. If we follow the stream
tube that issues from this ring, we find that when it passes through the
field of the nucleus its walls are no longer parallel but diverge at an
angle $d\alpha$. The relation between this angle and $db$ is found, from
equation (35), to be

$$db^2 = \left(\frac{2Ze^2}{mv^2}\right)^2 \frac{d \cot^2(\alpha/2)}{d\alpha}\, d\alpha$$

Thus a cross section of the tube at point $P$ will be a ring of radius $r \sin\alpha$
and breadth $r\,d\alpha$. The area of this ring is $2\pi r^2 \sin\alpha\, d\alpha$, whereas that
at point $P_0$ has an area $2\pi b\, db = \pi\, db^2$. As the particle has a velocity
$v$ at the two points and as $\operatorname{div} a^2\mathbf{v} = 0$,

$$a_0^2\, \pi\, |db^2| = a^2\, 2\pi r^2 \sin\alpha\, d\alpha$$

Therefore

$$\frac{a^2}{a_0^2} = C^2 = \frac{|db^2|}{2r^2 \sin\alpha\, d\alpha} = \left(\frac{2Ze^2}{mv^2}\right)^2 \frac{1}{2r^2 \sin\alpha} \left|\frac{d \cot^2(\alpha/2)}{d\alpha}\right| = \left(\frac{Ze^2}{rmv^2}\right)^2 \frac{1}{\sin^4(\alpha/2)} \tag{37}$$




$C^2$ evidently may be considered as the probability of the alpha particle
being deviated from its original direction by the angle $\alpha$; it
corresponds to Rutherford's deduction from classical mechanics.
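The last step of (37) can be checked numerically by differentiating $b^2$ from equation (35) with a central difference and comparing with the closed form. In the Python sketch below, the abbreviation `K` for $2Ze^2/mv^2$ and all function names are assumptions introduced for illustration.

```python
import math

def b_squared(alpha, K):
    """b^2 from eq. (35), with K standing for 2 Z e^2 / (m v^2)."""
    return (K / math.tan(alpha / 2.0)) ** 2

def intensity_ratio_numeric(alpha, r, K, d_alpha=1e-6):
    """C^2 = |d(b^2)| / (2 r^2 sin(alpha) d(alpha)), by central difference."""
    db2 = b_squared(alpha + d_alpha, K) - b_squared(alpha - d_alpha, K)
    return abs(db2) / (2.0 * r ** 2 * math.sin(alpha) * 2.0 * d_alpha)

def intensity_ratio_closed(alpha, r, K):
    """Closed form (37): (K / (2 r))^2 / sin^4(alpha / 2)."""
    return (K / (2.0 * r)) ** 2 / math.sin(alpha / 2.0) ** 4
```

For any angle away from zero the two functions should agree to within the accuracy of the finite difference, which confirms the $1/\sin^4(\alpha/2)$ law.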

We might take this agreement as a matter of course, since we have
made ample use of classical mechanics in our calculations. That
would be a misunderstanding of the result, for the calculation is based
essentially on wave equation (27), and the use of classical mechanics
has been possible only because of the close connection between the two
mechanics when the de Broglie wavelength is sufficiently short.

For the alpha particles this condition is fulfilled because their great
mass has a diminishing effect on the wavelength. For electrons,
however, the method fails because, as they have a mass about 7000 times
smaller than the alpha particles, their wavelength is of the order of
$10^{-9}$ cm. The assumptions of geometric optics hold only at distances
from the nucleus which are far greater than $10^{-9}$ cm. At such
distances the force field is very weak, and geometric methods are
inadequate for explaining the observed deflections of the electrons. For
the treatment of such problems we must determine, with the help of
equation (27), the change which a plane wave undergoes when it meets
an atomic scatterer such as a nucleus, atom, or molecule. The
calculations of this change follow a procedure developed by Born on the
basis of perturbation theory (Chapter 4).
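The scaling of the wavelength with mass can be made explicit. The sketch below assumes, for the sake of comparison, an electron moving with the same velocity as the alpha particle considered earlier; the electron mass and the names used are assumptions added for illustration.

```python
H = 6.62e-27           # Planck's constant, erg*sec
M_ALPHA = 6.6e-24      # alpha-particle mass, g
M_ELECTRON = 9.1e-28   # electron mass, g (assumed CGS value)
SPEED = 2e9            # common velocity, cm/sec (assumption for the comparison)

mass_ratio = M_ALPHA / M_ELECTRON
lam_alpha = H / (M_ALPHA * SPEED)
# At equal velocity, lambda = h / (m v) scales inversely with the mass.
lam_electron = lam_alpha * mass_ratio
```

With these values the mass ratio comes out somewhat above 7000 and the electron wavelength is a few times $10^{-9}$ cm.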

The deflection of particles by a charged nucleus is interpreted in the
new mechanics as a scattering process involving probability waves.
The originally plane de Broglie waves that are incident in a given
direction are scattered by the nucleus in all directions. If we
disregard the region immediately around the nucleus, the scattering takes
place in a manner quite similar to the penetration of the earth's
atmosphere by light. This is observed at once from equation (26) for the
phase velocity:

$$v(xyz) = \frac{E}{\sqrt{2m(E - V)}}$$

in which $V(xyz)$ is the field potential and $E$ the constant energy of the
particle. At a great distance from the nucleus $V = 0$, and therefore
$v = v_0 = \sqrt{E/2m}$. In the neighborhood of the nucleus, $V$ is greater or less
than zero, depending on whether the charge on the particle is positive
or negative. For small values of $r$, $v$ is greater than $v_0$ when the charge
is positive but less than $v_0$ when it is negative. If we consider the
ratio $v_0/v(xyz)$, which depends only on $r$, as the index of refraction of a
hypothetical medium, then for a positive charge it becomes less than
unity as the nucleus is approached, but for a negative charge it becomes
greater than unity. Thus probability waves are affected by a nucleus
as if the nucleus were surrounded by an atmosphere consisting of
spherical shells which become optically denser in the outward
direction for a positive corpuscle but optically thinner for a negative one.
When the rays penetrate this atmosphere, they are refracted from and
toward the normal respectively and thus leave the domain of the
nucleus in all directions. A wave packet that passes through the
field of a nucleus must, therefore, begin to disperse.
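The index of refraction of this hypothetical medium follows directly from the phase velocity: $v_0/v = \sqrt{1 - V/E}$. A minimal Python sketch for a Coulomb potential, in which the function name, the charge parameter `q`, and the CGS constant are assumptions, and $E$ is taken as the kinetic energy at large distance:

```python
import math

def refractive_index(r, E, Z, q, e=4.80e-10):
    """n(r) = v0 / v(r) = sqrt(1 - V(r)/E) for V = q * Z * e^2 / r.

    q is the particle charge in units of e (+2 for an alpha particle,
    -1 for an electron); r in cm, E in erg.
    """
    V = q * Z * e ** 2 / r
    if V >= E:
        raise ValueError("E - V <= 0: inside the classical turning point")
    return math.sqrt(1.0 - V / E)

# Kinetic energy of an alpha particle with v = 2e9 cm/sec.
E_alpha = 0.5 * 6.6e-24 * (2e9) ** 2
```

For `q = +2` the index is below unity and falls as `r` decreases; for `q = -1` it is above unity and rises, which reproduces the two kinds of atmosphere described above.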


PROBLEMS 


1. Verify relations (1) by discussing the accuracy $\Delta x$ with which the position of a
particle can be determined with the aid of a microscope.

2. By representing the motion of a particle by a de Broglie wave, show that the
diffraction of the wave by a slit of width $\Delta x$ is in agreement with the uncertainty
relation.

3. A beam of monochromatic light illuminates a diaphragm A containing two 
slits. A diffraction pattern then appears on a screen B set behind A. Imagine 
that the experiment is carried out with but one photon, and show that, owing to 
the uncertainty relations, the diffraction pattern is destroyed by any attempt to 
determine the slit through which the photon passed. (Set up an indicator behind 
each slit, and take into account the fact that the reaction of the indicator must not 
shift the photon from a maximum of the diffraction pattern to a neighboring 
minimum.) 

4. Show that the expression $d\nu/[d(1/\lambda)]$ for the group velocity can be put in the
form $v - \lambda(dv/d\lambda)$, where $v$ is the phase velocity.

5. How must the phase velocity $v$ be chosen as a function of $\nu$ in order that the
group velocity become inversely proportional to $v^2$?

6. Apply the equation $\nabla^2 u + (8\pi^2 m/h^2)(E - V)u = 0$ to the case wherein the
potential at a certain surface varies discontinuously, and show by using the integral
theorem of Gauss that the component of grad $u$, normal to the surface, behaves
continuously. The same holds for the function $u$ itself.

7. When a plane wave $u = a_1 e^{i\mathbf{k}_1 \cdot \mathbf{r}}$ falls on a surface at which $V$ is discontinuous,
the wave is partly reflected ($a_2 e^{i\mathbf{k}_2 \cdot \mathbf{r}}$) and partly refracted ($a_3 e^{i\mathbf{k}_3 \cdot \mathbf{r}}$). Find the
relations between the normal and tangential components of $\mathbf{k}_1$, $\mathbf{k}_2$, and $\mathbf{k}_3$.

8. A plane wave $a_1 e^{ikx}$ is moving in the direction of the $x$ axis against a potential
wall, $V$ = constant, extending from $x = 0$ to $x = a$. The wall $V$ is assumed to be
so high as to be impenetrable for the particle according to classical mechanics.
Show that the boundary conditions can be fulfilled only if there exists for $x < 0$
a reflected wave $a_2 e^{-ikx}$, and for $x > a$ a wave $a_3 e^{ikx}$ which has penetrated
the wall ("tunnel effect").


2 
WAVE MECHANICS 
OF STATIONARY STATES 


11. Schroedinger's Wave Equation. In the preceding chapter
wave mechanics was developed on the supposition that the field of
force does not change noticeably within a distance of the order of
magnitude $\lambda$. Under this condition the application of the principles
of geometric optics to the propagation of probability waves permits
the maintenance of a close connection with classical mechanics, the
formalism of geometric optics being identical with the treatment of
mechanics according to the Hamilton-Jacobi method.

We lose this connection when we try to apply the methods of wave
mechanics to an electron bound to an atom. The field of force acting
on such an electron varies so rapidly in the immediate neighborhood
of the nucleus that the application of geometric optics is out of the
question even when the wavelength is extremely short. For example,
let us consider the electron of a H atom. According to Bohr's theory,
in the ground state of this atom the electron moves with a momentum
$p = mv = 2\pi m e^2/h$ in a circular orbit of radius $a_1 = h^2/4\pi^2 m e^2$.
Should we for the moment disregard all doubts as to the admissibility
of this conception, there would remain an essential difficulty. Should
it be true that there actually are states of the electron that fall within
the above-mentioned values of momentum and radius, there would
be no way in which to describe the motion by using classical mechanics,
for the de Broglie wavelength associated with this motion would be
$\lambda = h/p = h^2/2\pi m e^2 = 2\pi a_1$. Within a space of these dimensions,
at a distance $a_1$ from the nucleus, the potential of the Coulomb field
can by no means be considered even approximately constant, and thus
the methods of geometric optics no longer apply.
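The relation $\lambda = 2\pi a_1$ can be confirmed with numbers. The Python sketch below evaluates the Bohr momentum, radius, and wavelength in CGS units; the numerical constants and variable names are assumptions added for illustration.

```python
import math

H = 6.62e-27         # Planck's constant, erg*sec
M_E = 9.11e-28       # electron mass, g (assumed CGS value)
E_CHARGE = 4.80e-10  # elementary charge, esu (assumed CGS value)

p_bohr = 2.0 * math.pi * M_E * E_CHARGE ** 2 / H             # p = 2*pi*m*e^2 / h
a_1 = H ** 2 / (4.0 * math.pi ** 2 * M_E * E_CHARGE ** 2)     # a_1 = h^2 / (4*pi^2*m*e^2)
lam = H / p_bohr                                              # de Broglie wavelength

print(f"a_1 = {a_1:.2e} cm, lambda = {lam:.2e} cm, lambda / a_1 = {lam / a_1:.3f}")
```

The wavelength is the full circumference of the orbit, so over one wavelength the distance scale of the Coulomb field is anything but small.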

Here we must make use of the rigorous solutions of Schroedinger's
wave equation

$$\nabla^2 u + \frac{8\pi^2 m}{h^2}(E - V)\, u = 0 \tag{27}$$

As we have seen, the purpose of wave mechanics is to represent the
motion of a particle starting from a given initial state $\Delta q\,\Delta p$ by a sum
of solutions of (27). In Section 9 this plan was used successfully by
combining into a wave packet approximate solutions of the form

$$a(xyzE\alpha\beta)\, e^{(2\pi i/h)[Et - S(xyzE\alpha\beta)]} \tag{38}$$

which satisfy equation (27) when the potential is nearly constant
within a distance of the order of magnitude $\lambda$. In this way we obtained
a probability function $u(xyzt)$ by which the behavior of the particle
can be described correctly.

As approximations are permissible no longer, our procedure must 
be to seek rigorous solutions of equation (27) and construct linear 
aggregates from these which will represent the analogue to the wave 
packet. A priori, we cannot say whether this procedure will have 
meaning. At any rate it is worth-while to investigate whether the 
coefficients of the aggregate can be connected with the measurements 
so that the sum will have the same significance as the function u, 
which represented a wave packet. 

In attempting the integration of equation (27), we immediately
discover a most striking peculiarity of that equation. It turns out
that a differential equation of that type can be solved by a regular
function $\psi(xyz)$, that is, one that is everywhere finite, continuous, and
single-valued, only if the energy parameter $E$ of the equation has a
value belonging to a definite discrete set, $E_0, E_1, \cdots, E_n, \cdots$. The
values of this set depend on the nature of the function $V(xyz)$ and are
called eigenvalues of the differential equation. The corresponding
solutions, $\psi_0, \psi_1, \cdots, \psi_n, \cdots$, are called eigenfunctions. These
functions are uniquely defined, that is, they contain no arbitrary
constants aside from an arbitrary factor by which they can be multiplied
by virtue of the homogeneity of the equation.
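How a regularity condition singles out discrete energies can be seen in a one-dimensional toy problem. The Python sketch below is an illustration under stated assumptions, not the method of the text: it takes units in which $h/2\pi = m = 1$, confines the particle between hard walls at $x = 0$ and $x = L$, integrates $u'' = 2(V - E)u$ from the left wall by a Runge-Kutta scheme, and searches by bisection for an energy at which $u$ also vanishes at the right wall.

```python
import math

def endpoint_value(E, V, L=1.0, steps=2000):
    """Integrate u'' = 2 (V(x) - E) u from x = 0 (u = 0, u' = 1) to x = L; return u(L)."""
    h = L / steps
    x, u, du = 0.0, 0.0, 1.0

    def acc(xx, uu):
        return 2.0 * (V(xx) - E) * uu

    for _ in range(steps):  # classical fourth-order Runge-Kutta step
        k1u, k1d = du, acc(x, u)
        k2u, k2d = du + 0.5 * h * k1d, acc(x + 0.5 * h, u + 0.5 * h * k1u)
        k3u, k3d = du + 0.5 * h * k2d, acc(x + 0.5 * h, u + 0.5 * h * k2u)
        k4u, k4d = du + h * k3d, acc(x + h, u + h * k3u)
        u += h * (k1u + 2.0 * k2u + 2.0 * k3u + k4u) / 6.0
        du += h * (k1d + 2.0 * k2d + 2.0 * k3d + k4d) / 6.0
        x += h
    return u

def find_eigenvalue(E_lo, E_hi, V, tol=1e-10):
    """Bisect on E until u(L) = 0; the bracket must contain exactly one sign change."""
    f_lo = endpoint_value(E_lo, V)
    if f_lo * endpoint_value(E_hi, V) > 0.0:
        raise ValueError("no sign change of u(L) in the given bracket")
    while E_hi - E_lo > tol:
        E_mid = 0.5 * (E_lo + E_hi)
        f_mid = endpoint_value(E_mid, V)
        if f_lo * f_mid <= 0.0:
            E_hi = E_mid
        else:
            E_lo, f_lo = E_mid, f_mid
    return 0.5 * (E_lo + E_hi)

# Field-free box: the lowest eigenvalue should be close to pi^2 / 2 in these units.
E_ground = find_eigenvalue(4.0, 6.0, lambda x: 0.0)
print(E_ground, math.pi ** 2 / 2.0)
```

For energies between the eigenvalues the integrated function misses the right-hand boundary condition, which is the toy analogue of a solution that fails to be regular.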

Thus the attempt to apply Schroedinger's equation to an atom
defined by a given potential function $V$ starts out with the promising
discovery that there exists a set of energy values which have a distinct
significance relative to the atom. They are, as we shall see, identical
with the energies belonging to the various states of the atom according
to Bohr's theory and are connected with the frequencies of the emitted
spectrum by the relation $h\nu = E_i - E_k$. It is a decisive success that
wave mechanics, of itself, is able to provide the critical energy values
without supplementary assumptions. However, for the present we
are interested in another point. Our intention is to combine the
solutions of equation (27) with indefinite coefficients into a linear
aggregate, taking into account the time $t$ in such a way that, following the
wave packet method, we multiply every eigenfunction by the factor
$e^{(2\pi i/h)E_i t}$. Then, instead of equation (29), we shall have the function

$$u(xyzt) = \sum_i c_i \psi_i\, e^{(2\pi i/h)E_i t} \tag{39}$$

The function $u$ is represented now by a sum because the eigenfunctions
$\psi_i$ form an infinite discrete† set, whereas in the continuous solutions
(38) $u$ was represented by an integral.

It is a question now of how to choose the coefficients $c_i$ in the
summation (39) so that $u$ may be regarded as a probability function which
will give information about the position of the electron. Obviously 
this choice can be made only on the basis of certain experiments, 
and therefore we are bound to undertake a careful examination of the 
experimental possibilities. 

12. The Experimental Possibilities. First of all, we are inter- 
ested in experiments that enable us to measure the energy of an atom. 
If we want the measurement to be exact, experiments that involve the 
coordinates and momentum of the electron are of no use. Only if 
the exact simultaneous values of g and p could be measured would the 
exact determination of the energy E = V(zxyz) + p*/2m be possible. 
Hence the only appropriate experiments for our purpose are those 
which deal directly with the energy of the atom. Of special impor- 
tance are the experiments of Franck and Hertz dealing with the energy 
transfer in the collision of an atom with an electron. They have the 
advantage that the energy of the colliding electron can be measured 
exactly both before and after the collision; hence the energy which the 
atom acquires because of the collision can be determined. It is true 
that we do not gain any information about the absolute energy of 
the atom in this way. What we do learn from the measurements is 
that this energy is always changed by an amount AZ, corresponding 
exactly with the difference H; — E; of two eigenvalues. From the 
measured AH we are able to infer the energy levels between which 
the passage takes place. To do this we must find those levels of the 
set of values the difference of which just equals AE. Thus the collision 
method informs us about the energy belonging to the atom both before 
and after the measurement. 
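The level-matching step just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `match_transition`, the tolerance `tol`, and the level values are hypothetical and are not taken from the text or from any experiment.

```python
from itertools import combinations


def match_transition(levels, delta_e, tol=1e-3):
    """Return all index pairs (i, k) with levels[i] - levels[k] close to delta_e.

    `levels` is an ascending list of eigenvalues; only differences are used,
    in keeping with the remark that the absolute energy is not observed.
    """
    matches = []
    for k, i in combinations(range(len(levels)), 2):  # k < i, so levels[i] >= levels[k]
        if abs((levels[i] - levels[k]) - delta_e) <= tol:
            matches.append((i, k))
    return matches


# Hypothetical level scheme in arbitrary energy units (illustration only).
levels = [0.0, 4.9, 6.7, 8.8]
print(match_transition(levels, 4.9))
print(match_transition(levels, 1.8))
```

Since only differences enter, more than one pair may match a given $\Delta E$ when the level scheme happens to contain equal spacings; the sketch returns every candidate rather than choosing among them.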

Now suppose that the energy of a given atom is known. Will we 
be able to find out further details about the motion of the electron in 
the given stationary state? Evidently not, because when we try to 
examine the motion we are bound to perturb the state by that act of 
observation in such a way that the state is changed into another. For
example, a single measurement of the position is sufficient to throw the
electron out of its orbit. The reason for this failure is that the energy
and phase of the rotational motion are conjugate quantities and as
such cannot be measured simultaneously. An exact knowledge of
the energy is possible only if we renounce any knowledge of the phase.
From our point of view, therefore, we cannot agree with the old atomic
theory in which a certain motion was correlated with the electron in
every stationary state. Such a conception is not in accordance
with the principle that theory must not make use of concepts that have
no observational meaning. On principle, we cannot know the motion
the electron is undergoing when the atom is in a given stationary
state, and therefore the question concerning the motion cannot be
included in a correct theory.

† For the present we assume a set of discrete eigenvalues. The case of a con-
tinuous spectrum is discussed in Section 36.

This does not mean, however, that a stationary state is definable 
only with respect to its energy. We can analyze the state more ade- 
quately if we confine ourselves to questions that can be answered on 
the basis of experiments. Such a question is the following: Let us 
assume that we have a great number of atoms of the same kind and 
under the same conditions. We pick samples at random out of the 
assemblage and examine their behavior when colliding with electrons. 
From this behavior we judge the energy levels of the atoms, so that 
from a sufficiently great number of tests a probability that an atom 
is at a given energy level can be inferred. 

First let us consider the case in which the experiments furnish a 
probability of unity for a certain energy $E_i$. Then we are certain that
any atom chosen at random from the assemblage is in the ith station- 
ary state, and we imagine that this atom is subjected to a measure- 
ment by which the position of the rotating electron is determined. 
Of course, by such a measurement the state is destroyed, but every 
atom examined provides a datum belonging to the description of the 
state. After carrying out a great number of such measurements we 
are in possession of a statistical ensemble which can give us information 
about the probability of finding the electron at the point xyz relative 
to the nucleus. This ensemble provides the limit of experimental 
possibilities, and we see that, although on principle there is no way of 
knowing the motion of the electron in a given stationary state of the 
atom, we can at least in a mental experiment make sure of all the 
positions corresponding to the state together with the probabilities of 
finding the electron in these positions. 

Our purpose now is to give to the eigenfunctions $\psi_i$ an interpretation
that is guaranteed by experimental possibilities. The idea that $\psi_i$
may be connected with the probabilities defined above suggests itself;
also the idea that we should try to establish a theory on the following
assumption: the square of the modulus of $\psi_i(xyz)$, taken as a function of
$xyz$, will, when multiplied by $dx\,dy\,dz$, give the relative probability
that a measurement of position made on an atom in the $i$th station-
ary state will find the electron within the volume element $dx\,dy\,dz$.
Whether $|\psi_i|^2$ really has this meaning cannot be decided by direct
experiments (feasible only in our imagination); it can be proved only
by an examination of the consequences.

Thus the question of how to determine the coefficients $c_i$ of the
summation in equation (39) is to be answered for a special case as
follows. If we know for certain that the atom has an energy $E_k$, we
take $c_k = 1$, with all the others being zero. The state then is repre-
sented by

$$u(xyzt) = \psi_k(xyz)\, e^{(i/\hbar)E_k t} \tag{40}$$

which is a wave of amplitude $\psi_k$ and frequency $\nu_k = E_k/h$ (with
$h = 2\pi\hbar$). The interpretation of this wave according to our theory is
that the square of the modulus of $u$, $uu^*$, where $u^*$ denotes the
conjugate complex to $u$, gives the
probability that the electron is to be found at $xyz$ at time $t$. In the
case defined by equation (40) the probability is independent of time,
since $t$ is eliminated in performing the operation $uu^*$.

13. States of Undefined Energy. Now let us consider the general 
case in which the measurement of energy does not supply the same 
value for all the atoms in the assemblage. Assume that the values 
$E_0, E_1, \ldots$ are found with the probabilities $c_0^2, c_1^2, \ldots$ respectively,
where, for formal reasons, the probabilities are written as squares the 
sum of which equals unity. This case requires careful consideration. 

At first we might imagine that the meaning of the measured statis- 
tics is perfectly clear. If we adopt the viewpoint of the old quantum 
theory wherein the atoms could exist only in the states $0, 1, 2, \ldots$,
we can interpret the statistics only in the sense that there are, in the
assemblage of $N$ atoms, $Nc_0^2$ atoms having an energy $E_0$, $Nc_1^2$ atoms
with an energy $E_1$, and so on. In short, the assemblage is a mixture
of different stationary states. 

Now, in a given case it is very easy to decide whether this interpre-
tation is correct. The method is to refine the statistical investigation
by removing from the whole assemblage a sufficiently great section
and ascertaining its statistics. The test is this: if the whole assemblage is
really a mixture, it should be possible, by repeating the above proce-
dure, to find a partial assemblage which is statistically different from
the whole. When once we happen to find atoms of the same energy
$E_k$ only, the statistics will show $c_k = 1$ and all the others zero.



If such experiments prove that the assemblage is a mixture, the 
probability of finding the electron of an atom picked at random from 
the assemblage at the point xyz evidently is given by 


$$\sum_i c_i^2\, \psi_i \psi_i^* \tag{41}$$

for the probability that the atom is in the $i$th stationary state is $c_i^2$,
and for this state the $xyz$ probability is $\psi_i(xyz)\psi_i^*(xyz)$. If it is a
single atom existing under given conditions, rather than an assemblage,
with which we are dealing, expression (41) still applies if an assemblage
under exactly the same conditions proves to be a mixture containing
the different states in the ratio $c_0^2 : c_1^2 : \cdots$.

Accordingly quantum mechanics does not deny the possibility of
mixtures. But, contrary to conventional opinion, it maintains that
not every assemblage may be considered a mixture. It may be that
there is a statistical identity for any sufficiently large partial assem- 
blage taken from the whole. Then the assemblage can be considered 
a mixture no longer, and we have to assume that all the atoms are in 
the same state, which is quite as well defined as a stationary state in 
the sense of Bohr but with the difference, however, that no definite 
energy value can be assigned to the atom. For the understanding of 
quantum mechanics it is of decisive importance to become familiar 
with this strange idea of a defined state without a defined energy. 
What is meant is not an atom with definite energy which is not known. 
Nor must we think of an atom the energy of which varies with time. 
In fact we must adopt the idea of a state not energetically defined. 
This concept is characteristic of quantum mechanics and cannot be 
understood on the basis of classical mechanics. There is no contra- 
diction between the idea of undefined energy and the fact that we are 
always in a position to ascertain a definite energy value by means of a 
measurement. Since by an act of measurement we destroy the state 
being investigated, the atom is compelled to seek a new energy level, 
that is, to assume a certain stationary state of definite energy. From 
this we can understand the significance of the numbers $c_i$ which charac-
terize the state in question; that is, $c_i^2$ is the probability of the atom
favoring the $i$th state.

Let us assume now an assemblage of atoms which is not a mixture 
but which, by statistical analysis, has been proved homogeneous. 
We state that the conditions to which the atoms are subject guarantee 
a pure case. The question in this case is what function $u(xyz)$ gives
the probability of locating the electron at $xyz$ by the product $uu^*$.
Quantum mechanics assumes that this function is given by

$$u(xyzt) = \sum_i c_i \psi_i(xyz)\, e^{(i/\hbar)E_i t} \tag{42}$$

Hence the probability in question is

$$uu^* = \sum_i c_i \psi_i\, e^{(i/\hbar)E_i t} \sum_k c_k \psi_k^*\, e^{-(i/\hbar)E_k t} = \sum_{i,k} c_i c_k\, \psi_i \psi_k^*\, e^{(i/\hbar)(E_i - E_k)t} \tag{43}$$


This probability does not agree with that of a mixture. The essen-
tial difference between equations (41) and (43) is that in (41) all the
terms are independent of time, whereas in (43) the terms with $i \neq k$
are periodic functions of time. To understand the meaning of this
difference we must refer to the electron on which no force acts. We
saw in Section 5 that for such an electron the motion of definite energy
is described by a de Broglie wave

$$u = a\, e^{(i/\hbar)[Et - (\xi x + \eta y + \zeta z)]}$$

and, since $uu^*$ is independent of both time and coordinates, it follows
that, when the energy of an electron is known exactly, any position in 
space has the same probability and therefore it becomes impossible to 
localize the particle. But, if we renounce an exact determination of 
energy, we have to substitute a wave packet for the single wave, the 
effect being that now the probability has a maximum in a certain 
domain of space. Then $uu^*$ depends on time; that is, the maximum
is not stationary but moves through space. Thus, if the state loses 
its energetical determinacy, the motion of the particle, in its successive 
phases, becomes visible with a distinctness that increases with decreas- 
ing definition of energy. 
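The contrast between the mixture (41) and the pure case (43) can be made concrete numerically. The sketch below assumes, purely for illustration, a particle confined between walls at $x = 0$ and $x = L$ (a system not treated in this section), units in which $\hbar = m = L = 1$, and the time factor $e^{(i/\hbar)E_i t}$ of equation (42); all function names are our own.

```python
import cmath
import math

HBAR = M = L = 1.0  # assumed units, chosen for convenience


def psi(n, x):
    """Normalized eigenfunction of the assumed box, n = 1, 2, ..."""
    return math.sqrt(2.0 / L) * math.sin(n * math.pi * x / L)


def energy(n):
    """Eigenvalue belonging to psi(n, x) in the assumed box."""
    return (n * math.pi * HBAR) ** 2 / (2.0 * M * L ** 2)


def mixture_density(c, x):
    """Equation (41): sum of c_i^2 |psi_i|^2, with no time dependence."""
    return sum(ci ** 2 * psi(n, x) ** 2 for n, ci in c.items())


def pure_density(c, x, t):
    """Equations (42) and (43): u u* for the superposition u."""
    u = sum(ci * psi(n, x) * cmath.exp(1j * energy(n) * t / HBAR)
            for n, ci in c.items())
    return (u * u.conjugate()).real


c = {1: math.sqrt(0.5), 2: math.sqrt(0.5)}        # real c_i with sum of squares 1
period = 2.0 * math.pi * HBAR / (energy(2) - energy(1))
for t in (0.0, 0.25 * period, 0.5 * period):
    print(t, mixture_density(c, 0.25), pure_density(c, 0.25, t))
```

The mixture value stays fixed, while the pure-case value at the same point oscillates with the period set by $E_2 - E_1$, which is the time-dependence of the terms with $i \neq k$ in (43).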

Thus the time-dependence of equation (43) is to be interpreted as 
the way in which the phases of motion of the electron are pictured in 
the propagation of the de Broglie waves. The phase, as a function of 
time, is canonically conjugate to the energy, and hence, according to 
the uncertainty relations, exact simultaneous measurement of these is 
impossible. Any indication of phase must vanish from the wave 
picture when the energy has an exact value; this is the reason why the 
element of time does not appear in equation (41). Only when the 
energy becomes indefinite does phase enter the picture, and then it 
represents a compensation by which the state gains in the definition of 
one entity what it loses in the definition of another. Now we see how 
the concept of the motion of rotation is to be treated. In quantum 
mechanics this concept has a meaning only when, in the definition of
the state, the energy recedes in favor of the phase. Under this con-
dition equation (43) consists of a very great number of eigenfunctions
superposed in such a way that $uu^*$ represents a nearly point-shaped
domain. It is to be expected that the domain will hold together, at 
least for a certain time, performing during this interval a motion 
according to the laws of classical mechanics. 

Whether quantum mechanics is correct in representing a pure case, 
characterized by given numbers $c_i$, by a function of the type (43), of
course, can be proved only by experiments. The application of the
theory to given systems will afford us the opportunity to test it. One
point, however, must be made clear beforehand. The assumption
that equation (43), when multiplied by $dx\,dy\,dz$, furnishes the proba-
bility of the electron being found within $dx\,dy\,dz$ at time $t$ can be
correct only if the integration of equation (43) over the whole space 
has a value that does not change with time, for that integral gives the 
probability that the electron is found anywhere in space and is always 
equal to unity. 

In order to show that this requirement really can be fulfilled, let us 
consider first the time-independent integral 


$$\iiint \psi_i \psi_i^*\, dx\, dy\, dz = \int \psi_i \psi_i^*\, dv \tag{44}$$

The function $\psi_i$ is a solution of equation (27), and, since that equation
is homogeneous in $\psi_i$, $\psi_i$ remains a solution when multiplied by an
arbitrary factor. This factor is chosen so that the integral in (44)
has a value of unity. We then call the function $\psi_i$ “normalized.”

It can be shown further that all integrals $\int \psi_i \psi_k^*\, dv$ vanish for $i \neq k$.
To prove this let us consider the equations

$$\frac{\hbar^2}{2m}\nabla^2 \psi_i + (E_i - V)\psi_i = 0$$

$$\frac{\hbar^2}{2m}\nabla^2 \psi_k^* + (E_k - V)\psi_k^* = 0$$

On multiplying the first by $\psi_k^*$ and the second by $\psi_i$ and subtracting,
we get

$$\frac{\hbar^2}{2m}\left(\psi_k^* \nabla^2 \psi_i - \psi_i \nabla^2 \psi_k^*\right) = (E_k - E_i)\,\psi_i \psi_k^*$$

This equation is integrated over the whole space. The integral on the
left vanishes according to Green’s theorem, and, since $E_i \neq E_k$,

$$\int \psi_i \psi_k^*\, dv = 0 \qquad (i \neq k)$$

Thus the functions $\psi_i$ are orthogonal. The conditions for normaliza-
tion and orthogonality are contained in the equation

$$\int \psi_i \psi_k^*\, dv = \delta_{ik} \tag{45}$$

$\delta_{ik}$ having a value unity when $i = k$ and a value zero when $i \neq k$.
Now, by using equation (45), we can transform the integral of (43)
into

$$\sum_{i,k} c_i c_k\, e^{(i/\hbar)(E_i - E_k)t} \int \psi_i \psi_k^*\, dv = \sum_i c_i^2$$

from which it follows that the integral is independent of time and has
the value unity since $\sum_i c_i^2 = 1$, this because $c_i^2$ represents the proba-
bility of the $i$th state.
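Relations (45) and the constancy of the total probability can be checked numerically. The following sketch again assumes, for illustration only, the eigenfunctions of a particle between walls at $x = 0$ and $x = L$ in units $\hbar = m = L = 1$; the helper names and the midpoint rule with `PANELS` subdivisions are our own choices.

```python
import cmath
import math

HBAR = M = L = 1.0   # assumed units
PANELS = 2000        # number of midpoint-rule panels (arbitrary choice)


def psi(n, x):
    return math.sqrt(2.0 / L) * math.sin(n * math.pi * x / L)


def energy(n):
    return (n * math.pi * HBAR) ** 2 / (2.0 * M * L ** 2)


def integrate(f):
    """Midpoint rule over 0 <= x <= L."""
    dx = L / PANELS
    return sum(f((j + 0.5) * dx) for j in range(PANELS)) * dx


def overlap(i, k):
    """Integral of psi_i psi_k* (the eigenfunctions here are real)."""
    return integrate(lambda x: psi(i, x) * psi(k, x))


def total_probability(c, t):
    """Integral of u u* over the whole interval at time t, with u as in (42)."""
    def density(x):
        u = sum(ci * psi(n, x) * cmath.exp(1j * energy(n) * t / HBAR)
                for n, ci in c.items())
        return abs(u) ** 2
    return integrate(density)


c = {1: 0.6, 2: 0.8}   # c_1^2 + c_2^2 = 1
print(overlap(1, 1), overlap(1, 2))
print([total_probability(c, t) for t in (0.0, 0.37, 2.0)])
```

The cross terms of (43) oscillate at every point, yet their integrals vanish by orthogonality, so the total stays at $\sum_i c_i^2$.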

14. Wave Mechanics and Bohr’s Theory. It is instructive to 
compare the statistical manner in which wave mechanics describes the 
stationary states of an atom with the theory of Bohr, according to 
which an electron should be performing an orbital motion that can be 
determined on the basis of classical mechanics. In quantum mechan- 
ics we cannot adopt this viewpoint because there is no experiment by 
means of which the motion can be observed. We cannot carry out 
more than one measurement on an atom which is known to be in a 
given stationary state, because this one measurement is sufficient to 
destroy the state completely. The concept “orbit of an electron”
has no meaning, therefore, for a state of definite energy, and thus, 
depending on the experimental possibilities, we must substitute a 
statistical ensemble of the positions in which the electron can be found 
for that concept. 

A priori, it is conceivable that statistics might furnish a probability 
which differs from zero only for points that define a certain orbit. 
However, according to quantum mechanics this is not true. As will 
be seen in applying the theory to an oscillator or to the H atom, the 
values of the functions $\psi_i$ differ from zero everywhere in space except
for certain “nodal surfaces.” In other words, for any stationary 
state it is possible that, in the determination of its position, the electron 
can be found at any distance from the nucleus. The orbits, which in 
Bohr’s theory play so predominant a role, are indicated in the wave- 
mechanical model of the atom only in so far as the maxima of the 
functions $\psi_i(xyz)$ make recognizable a surface that has a certain
resemblance to Bohr’s elliptical and circular orbits. Thus we may 
characterize the new theory figuratively by saying that every orbit is 
dissolved into a statistical cloud in which the orbit can be recognized 
only in the form of a diffuse condensation. 



However, the relationship between wave mechanics and Bohr’s 
atomic theory requires a deeper analysis. The great success of Bohr’s 
theory makes it rather obvious that the application of classical mechan- 
ics to the atom cannot be wholly wrong but to a certain extent must 
be justifiable. As we saw in the preceding section there certainly 
is a connection between the old and the new mechanics which becomes 
conspicuous only when we are dealing with states the energy of which is 
undefined. In order to bring out this connection more clearly, let 
us consider an atom which is known to be in a stationary state of high 
energy. Let us suppose that there is made a measurement which 
furnishes, within the accuracy afforded by the uncertainty relations, 
the position and velocity of the electron at a given moment. If the 
number $i$ of the state is sufficiently high, it is most probable that we
shall find the electron at a great distance from the nucleus since the
maximum of the function $\psi_i$ moves outward as $i$ increases. When the
measurement is completed, the atom, of course, is no longer in the
$i$th state because this state has been changed into another of unde-
fined energy which, according to equation (42), is represented by
$\sum_k c_k \psi_k\, e^{(i/\hbar)E_k t}$. Now, if we assume that light of a great wavelength
has been used for the measurement, the state cannot have been changed
very much, and therefore we can suppose that the sum contains only
$\psi_k$ which belong in the neighborhood of $\psi_i$. On the other hand, the
initial state, defined within the limits $\Delta x \cdots \Delta\zeta$, can be represented
by a wave packet as well, for, because of the great distance from the
nucleus, the force field acting on the electron varies slowly and hence
we can apply the methods of geometric optics to the problem. Then
as a representation we obtain a wave packet defining a function of
coordinates and time which must be equivalent to the function
$\sum_k c_k \psi_k\, e^{(i/\hbar)E_k t}$, for both of them have the same physical meaning,
namely, both give the probability of finding the particle at the point
xyz at time t. We know that in the case of a very accurately meas- 
ured momentum the wave packet contracts nearly to a point the 
motion of which agrees with that of a material point in classical 
mechanics. Thus in quantum mechanics also there are states of the 
atom which can be described approximately by the picture of an 
electron moving in an elliptical or circular orbit, provided, however, 
that the energy of the states is not assumed to be exactly defined. 
From this viewpoint the fact that classical mechanics was able to 
solve the problem of the atom, at least to a certain extent, seems sur- 
prising no longer. The theory was not altogether wrong. It was 
only the application of it to definite energy states that was incorrect.
Luckily, however, this did not prevent the theory from arriving at 
results, for example those concerning energy terms, which were at 
least partly correct. 

15. Expectation Values of Mechanical Entities. From the
definition of the function $u_i = \psi_i\, e^{(i/\hbar)E_i t}$, upon differentiation relative
to $t$ we obtain

$$\frac{\partial u_i}{\partial t} = \frac{i}{\hbar} E_i u_i$$

Therefore the wave equation

$$\frac{\hbar^2}{2m}\nabla^2 u_i + (E_i - V)u_i = 0$$

can be transformed into

$$\frac{\hbar^2}{2m}\nabla^2 u - Vu = -\frac{\hbar}{i}\frac{\partial u}{\partial t} \tag{46}$$

This equation, which may be called the time equation and no longer
contains the energy parameter, holds for all $u_i$ functions simultane-
ously, and consequently for any linear aggregate $\sum_i c_i u_i = \sum_i c_i \psi_i\, e^{(i/\hbar)E_i t}$,
in contrast to equation (27), which does not possess this property.
With Schroedinger, we assume that equation (46) holds also when the
potential $V$ depends not only on coordinates but on time as well; this
case has been omitted from our previous considerations. The deriva-
tion of equation (27) then loses its validity, since it is based on the
supposition that equation (20) can be solved by an expression of the
form $\psi(xyz)\, e^{(i/\hbar)Et}$.

For the present, however, we shall restrict ourselves to a potential V 
which is independent of time, and use equation (46) to determine the 
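probable velocity of the electron. Let us write the latter equation in
the form

$$\frac{\partial u}{\partial t} = -\frac{\hbar i}{2m}\left(\nabla^2 u - \frac{2m}{\hbar^2} Vu\right) \tag{47}$$

Replacing $u$ in this equation by its conjugate complex, we obtain

$$\frac{\partial u^*}{\partial t} = \frac{\hbar i}{2m}\left(\nabla^2 u^* - \frac{2m}{\hbar^2} Vu^*\right) \tag{48}$$

Multiplying (47) by $u^*$ and (48) by $u$ and adding, we get

$$\frac{\partial (uu^*)}{\partial t} = -\frac{\hbar i}{2m}\left(u^* \nabla^2 u - u \nabla^2 u^*\right) \tag{49}$$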


If we multiply both sides of this equation by the volume element
$dx\,dy\,dz = dv$ and integrate, on applying the theorem of Gauss we obtain

$$\frac{\partial}{\partial t}\int uu^*\, dv = -\frac{\hbar i}{2m}\oint \left(u^*\, \mathrm{grad}_n\, u - u\, \mathrm{grad}_n\, u^*\right) df$$

where the integral on the right is to be extended over the surface of
the space element considered and the subscript $n$ signifies the normal
to the surface outward. The following interpretation can be given to
the above relation. According to Section 13, $uu^*\, dv$ is the probability
that at time $t$ the particle can be found within $dv$. The product $uu^*$
can change with time only if there is a certain probability of the particle
entering or leaving the element. From this it can be inferred that the
vector

$$\mathbf{s} = \frac{\hbar i}{2m}\left(u^*\, \mathrm{grad}\, u - u\, \mathrm{grad}\, u^*\right) \tag{50}$$

must be the probability that the particle will in unit time pass through
a unit surface normal to the direction of the vector. Equation (49)
has the exact form of the equation of continuity in hydrodynamics,

$$\frac{\partial \rho}{\partial t} + \mathrm{div}\,(\rho \mathbf{v}) = 0$$

with which it agrees also in sense, the only difference being that in
wave mechanics $\rho = uu^*$ and $\rho\mathbf{v} = \mathbf{s}$ must not be interpreted as density
and current but as probabilities.
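A worked case may help to fix the sign and meaning of the vector $\mathbf{s}$. The sketch below assumes the force-free de Broglie wave quoted in Section 13, with amplitude $a$ and momentum components $\xi, \eta, \zeta$; collecting these components into a vector $\mathbf{p}$ and writing $\mathbf{r}$ for the position vector is our shorthand, not the text's.

```latex
u = a\, e^{(i/\hbar)[Et - \mathbf{p}\cdot\mathbf{r}]}, \qquad
\mathrm{grad}\, u = -\frac{i}{\hbar}\,\mathbf{p}\, u, \qquad
\mathrm{grad}\, u^* = +\frac{i}{\hbar}\,\mathbf{p}\, u^*

\mathbf{s} = \frac{\hbar i}{2m}\left(u^*\,\mathrm{grad}\,u - u\,\mathrm{grad}\,u^*\right)
           = \frac{\hbar i}{2m}\left(-\frac{2i}{\hbar}\right)\mathbf{p}\, uu^*
           = \frac{\mathbf{p}}{m}\, uu^*
```

For this wave the ratio of $\mathbf{s}$ to $uu^*$ is therefore $\mathbf{p}/m$, the classical velocity, and since $uu^* = aa^*$ is constant in space and time, $\partial(uu^*)/\partial t$ and $\mathrm{div}\,\mathbf{s}$ both vanish, so that the equation of continuity is satisfied.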

The analogy of $\mathbf{s}$ with $\rho\mathbf{v}$ suggests the assumption that the vector

$$\frac{\mathbf{s}}{uu^*} = \frac{\hbar i}{2m}\left(\frac{\mathrm{grad}\, u}{u} - \frac{\mathrm{grad}\, u^*}{u^*}\right) \tag{51}$$

represents the probable velocity with which the particle is moving at
time $t$ at the point $xyz$. It is true that this interpretation cannot be
tested directly, because the uncertainty relations do not permit an
experiment by which the velocity of a particle can be measured at a
given point. The only thing we can measure is the velocity itself,
a simultaneous measurement of position being impossible. However,
we can arrive at a statement about the velocity by the following con-
sideration. Since the probability of the particle being in $dv$ is $uu^*\, dv$, in
which case it is moving with a velocity $\mathbf{s}/uu^*$, the contribution of $dv$
to the velocity is given in magnitude and direction by

$$\frac{\mathbf{s}}{uu^*}\, uu^*\, dv = \mathbf{s}\, dv$$

Therefore

$$\int \mathbf{s}\, dv = \frac{\hbar i}{2m}\int \left(u^*\, \mathrm{grad}\, u - u\, \mathrm{grad}\, u^*\right) dv \tag{52}$$


is the expectation value of velocity at time t. This value has nothing to 
do with the position of the particle at time $t$ and can be determined
experimentally by taking a great number of atoms which are under
identical conditions so that they are described by the same function $u$.
We measure the velocity of the electron at time $t$ for every atom and
from these take the mean value. 

If we multiply equation (52) by m, we get the expectation value of 
the momentum: 


$$\bar{\mathbf{p}} = \frac{\hbar i}{2}\int \left(u^*\, \mathrm{grad}\, u - u\, \mathrm{grad}\, u^*\right) dv \tag{53}$$


In what follows, expectation values are indicated by a bar over the 
quantity. 

Equation (53) can be simplified by taking into account the fact that 
the integral 


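$$\int \left(u^*\, \mathrm{grad}\, u + u\, \mathrm{grad}\, u^*\right) dv = \int \mathrm{grad}\,(uu^*)\, dv = \oint uu^*\, \mathbf{n}\, df$$

where $\mathbf{n}$ is a unit vector in the direction of the outward drawn normal,
tends towards zero for an infinitely remote surface since the proba-
bility $uu^*$ vanishes there. Then, upon multiplying the integral by
$\hbar i/2$ and adding it to the equation for $\bar{\mathbf{p}}$, we obtain

$$\bar{\mathbf{p}} = \hbar i \int u^*\, \mathrm{grad}\, u\, dv \tag{54}$$

and therefore

$$\bar{p}_x = \hbar i \int u^* \frac{\partial u}{\partial x}\, dv \tag{55}$$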


If a stationary state of definite energy is given, we have to substitute
$\psi_i\, e^{(i/\hbar)E_i t}$ for $u$, and, from equation (53), we obtain

$$\bar{p}_x = \frac{\hbar i}{2}\int \left(\psi_i^* \frac{\partial \psi_i}{\partial x} - \psi_i \frac{\partial \psi_i^*}{\partial x}\right) dv$$

from which it follows that $\bar{p}_x$ vanishes when the wave equation (27)
is solved by real eigenfunctions, since then $\psi_i = \psi_i^*$.
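Equation (55) and the vanishing of $\bar{p}_x$ for a real function can be illustrated on a grid. The sketch assumes a one-dimensional Gaussian packet (not a system discussed in the text), units with $\hbar = 1$, and the momentum factor $e^{-(i/\hbar)p_0 x}$ that matches the plane wave $e^{(i/\hbar)[Et - \xi x]}$; the names `packet` and `mean_momentum` are hypothetical.

```python
import cmath
import math

HBAR = 1.0  # assumed unit


def packet(x, p0, sigma=1.0):
    """Normalized Gaussian envelope times exp(-(i/hbar) p0 x)."""
    norm = (math.pi * sigma ** 2) ** -0.25
    return norm * math.exp(-x ** 2 / (2.0 * sigma ** 2)) * cmath.exp(-1j * p0 * x / HBAR)


def mean_momentum(u, x_min=-10.0, x_max=10.0, n=4000):
    """Approximate hbar*i times the integral of u* du/dx, the 1-D form of (55)."""
    dx = (x_max - x_min) / n
    total = 0j
    for j in range(1, n):
        x = x_min + j * dx
        dudx = (u(x + dx) - u(x - dx)) / (2.0 * dx)   # central difference
        total += u(x).conjugate() * dudx * dx
    return 1j * HBAR * total


print(mean_momentum(lambda x: packet(x, 2.0)))   # close to p0 = 2
print(mean_momentum(lambda x: packet(x, 0.0)))   # real function: close to 0
```

The finite-difference derivative is accurate only to second order in the grid spacing, so the first value agrees with $p_0$ to a few parts in $10^5$ rather than exactly.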

The expectation values of other observables can be evaluated in the 
same manner as that for p. For example, that for the coordinate x 
is given by 


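$$\bar{x} = \int uu^* x\, dv \tag{56}$$

or, in general, for an arbitrary function $f(xyz)$,

$$\overline{f(xyz)} = \int uu^* f(xyz)\, dv \tag{57}$$

Thus the expectation value of the potential is

$$\bar{V} = \bar{E}_{pot} = \int uu^*\, V(xyz)\, dv$$

and that for $\mathbf{K} = -\mathrm{grad}\, V$ is given by

$$\bar{\mathbf{K}} = -\int uu^*\, \mathrm{grad}\, V\, dv$$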


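To evaluate $\bar{E}$, let us refer to the interpretation of the coefficients $c_i$ in
the expression $u = \sum_i c_i \psi_i\, e^{(i/\hbar)E_i t}$. According to Section 13, the prod-
uct $c_i c_i^*$ gives the probability of the energy value $E_i$, so that $\bar{E} =
\sum_i c_i c_i^* E_i$. This expression can be transformed. From the equation

$$\frac{\partial u}{\partial t} = \frac{i}{\hbar}\sum_k c_k \psi_k E_k\, e^{(i/\hbar)E_k t}$$

we obtain

$$\frac{\hbar}{i}\, u^* \frac{\partial u}{\partial t} = \sum_i c_i^* \psi_i^*\, e^{-(i/\hbar)E_i t} \sum_k c_k \psi_k E_k\, e^{(i/\hbar)E_k t}$$

Upon integrating this over the whole space and taking relations (45)
into account, we obtain

$$\frac{\hbar}{i}\int u^* \frac{\partial u}{\partial t}\, dv = \sum_i c_i c_i^* E_i$$

so that we get

$$\bar{E} = \frac{\hbar}{i}\int u^* \frac{\partial u}{\partial t}\, dv$$

We now obtain for the expectation value of kinetic energy

$$\bar{E}_{kin} = \bar{E} - \bar{E}_{pot} = \int \left(\frac{\hbar}{i}\, u^* \frac{\partial u}{\partial t} - V uu^*\right) dv$$

and therefore, because of equation (46),

$$\bar{E}_{kin} = -\frac{\hbar^2}{2m}\int u^* \nabla^2 u\, dv \tag{58}$$

from which it follows that, since $E_{kin} = (p_x^2 + p_y^2 + p_z^2)/2m$,

$$\overline{p_x^2} = -\hbar^2 \int u^* \frac{\partial^2 u}{\partial x^2}\, dv \tag{58'}$$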


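Finally let us determine the expectation value of the angular momen-
tum, $\mathbf{d} = \mathbf{r} \times \mathbf{p}$. We have

$$\bar{\mathbf{d}} = \frac{\hbar i}{2}\int \mathbf{r} \times \left(u^*\, \mathrm{grad}\, u - u\, \mathrm{grad}\, u^*\right) dv$$

Transforming as for $\bar{\mathbf{p}}$, we get

$$\bar{\mathbf{d}} = \hbar i \int (\mathbf{r} \times \mathrm{grad}\, u)\, u^*\, dv \tag{59}$$

Therefore

$$\bar{d}_x = \hbar i \int u^* \left(y \frac{\partial u}{\partial z} - z \frac{\partial u}{\partial y}\right) dv \tag{59'}$$

For the expectation value of $\mathbf{M} = \mathbf{r} \times \mathbf{K}$, we find

$$\bar{\mathbf{M}} = -\int uu^*\, \mathbf{r} \times \mathrm{grad}\, V\, dv$$

As we shall see in Chapter 3, all expectation values are real automati-
cally. We could show that the expectation values of the various
entities are connected by the relations of classical mechanics $d\bar{\mathbf{p}}/dt =
\bar{\mathbf{K}}$ and $d\bar{\mathbf{d}}/dt = \bar{\mathbf{M}}$. However, we shall derive these relations in a
more general way in the next chapter.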
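The two expressions for the expectation value of the energy, the sum over $c_i c_i^* E_i$ and the integral of $(\hbar/i)\,u^*\,\partial u/\partial t$, can be compared numerically. As before, the sketch assumes a particle between walls at $x = 0$ and $x = L$ in units $\hbar = m = L = 1$, a choice made only for illustration; the function names and the step sizes are our own.

```python
import cmath
import math

HBAR = M = L = 1.0  # assumed units


def psi(n, x):
    return math.sqrt(2.0 / L) * math.sin(n * math.pi * x / L)


def energy(n):
    return (n * math.pi * HBAR) ** 2 / (2.0 * M * L ** 2)


def state(c, x, t):
    """u = sum of c_i psi_i exp((i/hbar) E_i t)."""
    return sum(ci * psi(n, x) * cmath.exp(1j * energy(n) * t / HBAR)
               for n, ci in c.items())


def mean_energy_from_coefficients(c):
    """Sum of c_i c_i* E_i."""
    return sum(abs(ci) ** 2 * energy(n) for n, ci in c.items())


def mean_energy_from_time_derivative(c, t=0.3, dt=1e-5, panels=2000):
    """(hbar/i) times the integral of u* du/dt, by midpoint rule and central difference."""
    dx = L / panels
    total = 0j
    for j in range(panels):
        x = (j + 0.5) * dx
        dudt = (state(c, x, t + dt) - state(c, x, t - dt)) / (2.0 * dt)
        total += state(c, x, t).conjugate() * dudt * dx
    return (HBAR / 1j) * total


c = {1: 0.6, 2: 0.8}
print(mean_energy_from_coefficients(c))
print(mean_energy_from_time_derivative(c))
```

The second value carries a negligible imaginary part from rounding; its real part agrees with the first to the accuracy of the finite difference in time.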

16. The Principle of Transformation. In order to discover the 
meaning of $\psi_i$, in Section 13 we considered an assemblage of atoms, all
in the $i$th stationary state. By measuring the position of the electron
for every atom, we can determine the probability of any position $xyz$,
and we assumed that this probability, as a function of $xyz$, could be
expressed by $\psi_i\psi_i^*$. We may, however, choose, instead of $xyz$, the
momentum of the particle and inquire about the probability that a
measurement of $\mathbf{p}$ will furnish a vector with the components $\xi$, $\eta$, $\zeta$.
Wave mechanics maintains that this probability is given by the norm
$\chi_i(\xi\eta\zeta)\chi_i^*(\xi\eta\zeta)$ of a function $\chi_i(\xi\eta\zeta)$ which is connected to $\psi_i(xyz)$ by
the relation

$$\chi_i(\xi\eta\zeta) = h^{-3/2}\int \psi_i(xyz)\, e^{(i/\hbar)(\xi x + \eta y + \zeta z)}\, dv \tag{60}$$

To prove this we show first that the integral of $\chi_i\chi_i^*$ over the whole
momentum space has the value unity, as is necessary if $\chi_i\chi_i^*$ is to
satisfy the stated interpretation. Letting $dp = d\xi\, d\eta\, d\zeta$, we write


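$$\int_{-\infty}^{+\infty} \chi_i\chi_i^*\, dp = h^{-3}\int_{-\infty}^{+\infty} dp \int \psi_i^*(xyz)\, dv \int \psi_i(x'y'z')\, e^{(i/\hbar)[(x'-x)\xi + (y'-y)\eta + (z'-z)\zeta]}\, dv' \tag{61}$$

To evaluate the right-hand side, let us consider an expression of the
form

$$\int_{-\infty}^{+\infty} f(x)\, dx \int_{-g}^{+g} e^{(i/\hbar)x\xi}\, d\xi = 2\hbar \int_{-\infty}^{+\infty} f(x)\, \frac{\sin(gx/\hbar)}{x}\, dx$$

When the limits $\pm g$ of the second integral tend to $\pm\infty$, $\sin(gx/\hbar)$
becomes a rapidly oscillating function of $x$, permitting the integrand
to be effective only in the immediate neighborhood of $x = 0$, whereas
in any other interval $dx$ the oscillation causes a zero result. Thus we
have

$$\int_{-\infty}^{+\infty} f(x)\, dx \int_{-\infty}^{+\infty} e^{(i/\hbar)x\xi}\, d\xi = 2\hbar f(0) \int_{-\infty}^{+\infty} \frac{\sin x'}{x'}\, dx' = 2\pi\hbar f(0)$$

This equation can be generalized to give

$$\int f(xyz)\, dv \int_{-\infty}^{+\infty} e^{(i/\hbar)(x\xi + y\eta + z\zeta)}\, dp = 8\pi^3\hbar^3 f(000) = h^3 f(000) \tag{62}$$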


Equation (61) can be transformed by making use of the above equa-
tion. First write (61) in the form

$$\int_{-\infty}^{+\infty} \chi_i\chi_i^*\, dp = h^{-3}\int \psi_i^*(xyz)\, dv \int \psi_i(x'y'z')\, dv' \int_{-\infty}^{+\infty} e^{(i/\hbar)[(x'-x)\xi + (y'-y)\eta + (z'-z)\zeta]}\, dp$$

In the $dv'$ integral set $x'' = x' - x$, $y'' = y' - y$, $z'' = z' - z$. Then
$\psi_i(x'y'z')$ changes to a function $\phi(x''y''z'')$. Thus

$$\int \psi_i(x'y'z')\, dv' \int_{-\infty}^{+\infty} e^{(i/\hbar)[(x'-x)\xi + \cdots]}\, dp = \int \phi(x''y''z'')\, dv'' \int_{-\infty}^{+\infty} e^{(i/\hbar)(x''\xi + y''\eta + z''\zeta)}\, dp$$

But, according to equation (62), the last expression is $h^3\phi(000) =
h^3\psi_i(xyz)$, and so we find that

$$\int_{-\infty}^{+\infty} \chi_i\chi_i^*\, dp = \int \psi_i\psi_i^*\, dv$$

As the integral has the value unity, the theorem is proved.
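The next step is to show that

$$\int_{-\infty}^{+\infty} \chi_i\chi_i^*\, \xi\, dp = \bar{\xi} = \bar{p}_x$$

which is necessary if $\chi_i\chi_i^*\, dp$ is to represent the momentum probability
within $dp$. Now

$$\int \chi_i\chi_i^*\, \xi\, dp = h^{-3}\int dp \int \psi_i^*(xyz)\, dv \int \psi_i(x'y'z')\, \xi\, e^{(i/\hbar)[(x'-x)\xi + \cdots]}\, dv'$$

If we transform $\xi\, e^{(i/\hbar)[(x'-x)\xi + \cdots]}$ into $(\hbar/i)(\partial/\partial x')\, e^{(i/\hbar)[(x'-x)\xi + \cdots]}$,
the integral relative to $dv'$ becomes

$$\frac{\hbar}{i}\int \psi_i(x'y'z')\, \frac{\partial}{\partial x'}\, e^{(i/\hbar)[(x'-x)\xi + \cdots]}\, dv' = \frac{\hbar}{i}\iint \Big[\psi_i\, e^{(i/\hbar)[(x'-x)\xi + \cdots]}\Big]_{x'=-\infty}^{x'=+\infty} dy'\, dz' - \frac{\hbar}{i}\int \frac{\partial \psi_i}{\partial x'}\, e^{(i/\hbar)[(x'-x)\xi + \cdots]}\, dv'$$

Let us suppose that $\psi_i$ vanishes when $x$ becomes very great. Then we
have

$$\int_{-\infty}^{+\infty} \chi_i\chi_i^*\, \xi\, dp = -\frac{\hbar}{i}\, h^{-3}\int dp \int \psi_i^*(xyz)\, dv \int \frac{\partial \psi_i}{\partial x'}\, e^{(i/\hbar)[(x'-x)\xi + \cdots]}\, dv' = -\frac{\hbar}{i}\int \psi_i^* \frac{\partial \psi_i}{\partial x}\, dv$$

which according to equation (55) is truly the expectation value $\bar{p}_x$.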
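Both theorems just proved can be checked numerically in one dimension. The sketch below assumes a Gaussian packet with mean momentum `p0` (our own test function, not an eigenfunction from the text), units with $\hbar = 1$ so that $h = 2\pi$, and the one-dimensional analogue of (60) with the factor $h^{-1/2}$; the integrals are replaced by plain Riemann sums on finite grids.

```python
import cmath
import math

HBAR = 1.0
H = 2.0 * math.pi * HBAR   # h = 2*pi*hbar


def psi(x, p0=2.0, sigma=1.0):
    """Assumed test function: normalized Gaussian times exp(-(i/hbar) p0 x)."""
    norm = (math.pi * sigma ** 2) ** -0.25
    return norm * math.exp(-x ** 2 / (2.0 * sigma ** 2)) * cmath.exp(-1j * p0 * x / HBAR)


def chi(xi, p0=2.0, x_max=8.0, n=400):
    """1-D analogue of (60): h^(-1/2) * integral of psi(x) exp((i/hbar) xi x) dx."""
    dx = 2.0 * x_max / n
    total = 0j
    for j in range(n + 1):
        x = -x_max + j * dx
        total += psi(x, p0) * cmath.exp(1j * xi * x / HBAR) * dx
    return total / math.sqrt(H)


def momentum_statistics(p0=2.0, half_width=6.0, n=240):
    """Return (integral of chi chi*, integral of chi chi* xi) over the xi grid."""
    dxi = 2.0 * half_width / n
    norm = 0.0
    mean = 0.0
    for j in range(n + 1):
        xi = p0 - half_width + j * dxi
        weight = abs(chi(xi, p0)) ** 2 * dxi
        norm += weight
        mean += xi * weight
    return norm, mean


print(momentum_statistics())
```

With the sign conventions of (60), the packet carrying the factor $e^{-(i/\hbar)p_0 x}$ is concentrated around $\xi = p_0$ in momentum space, in agreement with the value that (55) gives for the same function.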
Evidently equation (60) represents nothing but a Fourier expansion 
of the function $\chi_i(\xi\eta\zeta)$ and permits the inversion

$$\psi_i(xyz) = h^{-3/2}\int \chi_i(\xi\eta\zeta)\, e^{-(i/\hbar)(\xi x + \eta y + \zeta z)}\, dp \tag{60'}$$

Thus the same stationary state can be represented in $q$ space by a
function $\psi_i(xyz)$ as well as in $p$ space by a function $\chi_i(\xi\eta\zeta)$, wherein
$\psi_i$ and $\chi_i$ are connected by relations (60) and (60'). This means that
any state of undefined energy can be described either in $q$ space by

$$u(xyzt) = \sum_i c_i \psi_i\, e^{(i/\hbar)E_i t} \tag{63}$$

or in $p$ space by

$$w(\xi\eta\zeta t) = \sum_i c_i \chi_i\, e^{(i/\hbar)E_i t} \tag{63'}$$

Thus we get two equivalent descriptions of the state which are related
by the equations

$$u(xyzt) = h^{-3/2}\int w(\xi\eta\zeta t)\, e^{-(i/\hbar)(\xi x + \eta y + \zeta z)}\, dp \tag{64}$$

$$w(\xi\eta\zeta t) = h^{-3/2}\int u(xyzt)\, e^{(i/\hbar)(\xi x + \eta y + \zeta z)}\, dv \tag{64'}$$


These equations furnish a good example of a principle which plays an 
important part in quantum mechanics. Let us consider an atom the 
momentary state of which in $q$ space is described by a certain function
$u(xyz)$, leaving the time-dependence out of consideration. We
know the meaning of the function, for $uu^*$ gives the probability that a
measurement will locate the electron at $xyz$. In order to arrive at
equation (64), which expresses the connection between $u$ and the
representation of the same state in $p$ space, consider the operator
$-(\hbar/i)(\partial/\partial x)$. If a function of $x$ is subject to this operator, we get
another function which we may call the image of the first. Generally
the operator changes the character of the function completely. For
example, $x^2$ is changed by $-(\hbar/i)(\partial/\partial x)$ into $-(2\hbar/i)x$. However,
there exist certain functions $f(xyz)$ which are changed by the operator
into a multiple of themselves, so that we can write

$$-\frac{\hbar}{i}\frac{\partial f}{\partial x} = af$$

where $a$ denotes a constant parameter. The functions which satisfy
the above equation are given by $f = \phi(yz)\, e^{-(i/\hbar)ax}$, $\phi$ being an arbitrary
function of $y$ and $z$. These functions $f$ are called eigenfunctions of the
operator $-(\hbar/i)(\partial/\partial x)$. Every eigenfunction corresponds to a certain
eigenvalue $a$. With the help of these concepts, we now can assign the
following meaning to equations (64) and (64'). In order to determine
the probability that a measurement of momentum will furnish a value
$p_x = \xi$, we must express the function $u(xyz)$ in terms of the eigen-
functions belonging to the operator $-(\hbar/i)(\partial/\partial x)$, that is, in terms of
the functions $e^{-(i/\hbar)x\xi}$. When this is performed by the Fourier
expansion,

$$u(xyz) = h^{-1/2}\int w'(yz\xi)\, e^{-(i/\hbar)x\xi}\, d\xi \tag{65}$$

the coefficient $w'(yz\xi)$ of a definite $e^{-(i/\hbar)x\xi}$ is the probability amplitude
for $p_x = \xi$, or $w'w'^*$ is the probability that we shall find $\xi$ for $p_x$.† It is
seen that the probability, through $y$ and $z$, depends on the place where
the measurement is carried out. This does not contradict the experi-
mental possibilities since $p_x$, $y$, and $z$ can be measured simultaneously
with precision.

† Strictly speaking, the coefficient of $e^{-(i/\hbar)x\xi}$ in (65) is not $w'$ but $w'h^{-1/2}$. The
factor $h^{-1/2}$ is due to the failure of the normalization rule (45) in the case of a
continuous spectrum. The rule must be replaced then by another. Cf. Section
30.
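The action of the operator $-(\hbar/i)(\partial/\partial x)$ on the two kinds of function mentioned above can be tried out directly. The sketch assumes $\hbar = 1$ and replaces the derivative by a central difference; `momentum_operator` and `eigenfunction` are hypothetical helper names.

```python
import cmath

HBAR = 1.0  # assumed unit


def momentum_operator(f, x, step=1e-5):
    """Apply -(hbar/i) d/dx to f at the point x, using a central difference."""
    derivative = (f(x + step) - f(x - step)) / (2.0 * step)
    return -(HBAR / 1j) * derivative


def eigenfunction(xi):
    """exp(-(i/hbar) xi x), the eigenfunction belonging to the eigenvalue xi."""
    return lambda x: cmath.exp(-1j * xi * x / HBAR)


# x^2 is changed into -(2 hbar / i) x, a function of a different character.
print(momentum_operator(lambda x: x ** 2, 1.5), -(2.0 * HBAR / 1j) * 1.5)

# An eigenfunction is only multiplied by its eigenvalue.
f = eigenfunction(0.7)
print(momentum_operator(f, 0.4) / f(0.4))
```

The ratio printed in the last line does not depend on the point $x$ chosen, which is what distinguishes an eigenfunction from a function such as $x^2$.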

Now we take into consideration $p_y$ as well as $p_x$ and look for the
probability of a measurement furnishing $\xi$ and $\eta$ for $p_x$ and $p_y$ respectively.
We consider the function u(xyz) in its dependence on y and
expand it in terms of the eigenfunctions belonging to the operator
$-(\hbar/i)(\partial/\partial y)$, that is, in terms of the functions $e^{-(i/\hbar)y\eta}$. This means
an expansion of the coefficient $w'(yz\xi)$ in equation (65) and leads to
the equation


$$u(xyz) = h^{-1} \iint w''(z\xi\eta)\, e^{-(i/\hbar)(x\xi + y\eta)}\, d\xi\, d\eta$$ (65′)


In this expansion the product of two eigenfunctions appears with the
factor $w''(z\xi\eta)$, and this factor gives, by means of $w''w''^*$, the probability
we seek. Finally a third expansion performed by using the eigenfunctions
of the operator $-(\hbar/i)(\partial/\partial z)$ provides equation (64).

Thus, when a state of the atom is represented in q space by a function
u(xyz), we find the statistics of the momentum components by
expressing u in terms of the eigenfunctions of the operator $-(\hbar/i)(\partial/\partial q)$
or, in other words, by resolving u into the spectrum defined by the
eigenfunctions. We generalize this theorem by maintaining that any
mechanical entity can be represented by an operator with eigenfunctions
$f_0, f_1, \cdots$ and eigenvalues $a_0, a_1, \cdots$, which are related
to the entity in such a way that (1) the eigenvalues a represent the
possible values of the entity as found by measurement; (2) the expansion
of u into $\sum c_i f_i$ furnishes a statistical ensemble of the a values by
means of $|c_i|^2$. For example, consider the energy E. It can be seen
readily that for a single particle, on which a force field V(xyz) acts,
the corresponding operator is given by


$$-\frac{\hbar^2}{2m}\nabla^2 + V(xyz)$$ (66)

because the eigenfunctions of this operator satisfy the equation

$$\left[-\frac{\hbar^2}{2m}\nabla^2 + V(xyz)\right] f = af$$


This is the wave equation (27) which, as we have seen, can be solved
only for a discrete set, $a = E_0, E_1, \cdots$, by the functions $\psi_0, \psi_1, \cdots$.
Therefore the a values are in truth the possible values furnished
by a measurement, and the $c_i$ terms of the expansion
$u = \sum c_i\psi_i$, through $|c_i|^2$, determine the probabilities that the atom will
assume one of the values $E_i$ when a measurement of E is made.
Moreover, the theorem stated applies also when the state of the
atom is represented in p space. In this case, in order to arrive at
equation (64′), we must correlate the operators $(\hbar/i)(\partial/\partial\xi)$, $(\hbar/i)(\partial/\partial\eta)$,
$(\hbar/i)(\partial/\partial\zeta)$, together with the eigenfunctions $e^{(i/\hbar)x\xi}$, $e^{(i/\hbar)y\eta}$,
$e^{(i/\hbar)z\zeta}$, to the coordinates xyz. This means that it is the expansion of
$w(\xi\eta\zeta)$ in terms of the functions $e^{(i/\hbar)(x\xi + y\eta + z\zeta)}$ that provides the probability
amplitudes for any position xyz. Here the operator associated
with the energy is given by

$$\frac{\xi^2 + \eta^2 + \zeta^2}{2m} + V\!\left(\frac{\hbar}{i}\frac{\partial}{\partial p}\right)$$

in which $V[(\hbar/i)(\partial/\partial p)]$ signifies the expression into which V(xyz) is
transformed by substituting $(\hbar/i)(\partial/\partial\xi)$, $(\hbar/i)(\partial/\partial\eta)$, $(\hbar/i)(\partial/\partial\zeta)$
for x, y, z, respectively. The eigenfunctions of this operator are the
functions $\chi_i(\xi\eta\zeta)$ defined by equation (60).

Thus, when we adopt a certain representation of the state, any 
observable of the system can be correlated to an operator. This fact 
enables us to infer from the given statistics of certain observables the 
statistics of any other observable. More exactly, when the state is 
described by the probability function u(xyz), it is determined statisti- 
cally by this function with reference to all observables. Expanding 
u(xyz) in terms of certain eigenfunctions, we can anticipate the statistical
results of measurements dealing with, for example, energy or
momentum. To the energy we must coordinate the operator (66) so
that u is expanded into $u = \sum c_i\psi_i$. When this equation is multiplied
by $\psi_k^*\,dv$ and integrated over the whole space, we get, by virtue of
equation (45), $c_k = \int u\psi_k^*\,dv$. In this way we can evaluate from
u(xyz) the probability amplitudes of the energy. On the other hand,
the operator of a momentum component is $-(\hbar/i)(\partial/\partial q)$. Accordingly
in the case of momentum it is the expansion (64) that informs us
about the probability amplitudes defined by equation (64′) as

$$w(\xi\eta\zeta) = h^{-3/2} \int u\, e^{(i/\hbar)(x\xi + y\eta + z\zeta)}\, dv$$


When, however, the state is described in terms of momentum statistics,
that is, when the function $w(\xi\eta\zeta)$ is given, the operators belonging
to energy and coordinates are

$$\frac{\xi^2 + \eta^2 + \zeta^2}{2m} + V\!\left(\frac{\hbar}{i}\frac{\partial}{\partial p}\right) \quad\text{and}\quad \frac{\hbar}{i}\frac{\partial}{\partial p}$$

and the corresponding amplitudes $c_i$ and u can be evaluated by means
of the equations

$$c_i = \int w\chi_i^*\, dp \quad\text{and}\quad u = h^{-3/2} \int w\, e^{-(i/\hbar)(x\xi + y\eta + z\zeta)}\, dp$$

Finally let us consider the case in which the state is defined by the 
probability amplitudes c; of the energy, that is, the description refers 
to the E space. Here the determination of the operators belonging to 
coordinates and momentum is not so easy. They are no longer differential
operators since these could not be applied to something defined
by a discrete set of numbers, $c_0, c_1, \cdots$, but are given by matrices
with which we shall deal in the next chapter.

Thus quantum mechanics establishes certain transformation equa- 
tions by which the statistical statements that can be made for a 
system in a given state are interrelated. However, in order to arrive 
at a valid mechanics, we must complete the plan by means of a law 
according to which the state of an unperturbed system changes with 
time—a law that can be formulated to agree with the time-dependence 
required by equations (47) and (63). This law can be stated as follows:
The probability amplitudes $c_i(t)$ of the energy of an unperturbed
system, as far as time-dependence is concerned, are given by
$c_i(t) = c_i e^{(i/\hbar)E_i t}$, where $c_i$ represent the amplitudes for t = 0. According to
this law the probability $c_i(t)c_i^*(t)$ of finding the system in a certain
energy state does not change in time, since $c_i(t)c_i^*(t) = c_ic_i^*$. Thus,
if the system is in a definite stationary energy state $E_k$ at t = 0 (that
is, if $c_k = 1$, all other $c_i$ being zero), it remains in this state as long as
no disturbing forces act on the atom. This implies that in quantum
mechanics there are also observables which, by their initial values, are
uniquely determined for all the future. If we can infer from a measurement
that at t = 0 the energy of the system is $E_i$, we may be sure
that a measurement at any future time will furnish the same value.
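The time law can be tried out numerically. The following minimal sketch assumes units with ħ = 1 and the phase factor $e^{(i/\hbar)E_it}$; the amplitudes and energies in the usage note are arbitrary illustrative values, and the helper names `evolve` and `probabilities` are not from the text. The conclusion does not depend on the sign chosen for the exponent, since only a phase is involved.

```python
import cmath

HBAR = 1.0  # assumption: units in which hbar = 1


def evolve(c0, energies, t):
    """Attach the phase exp((i/hbar) E_i t) to each initial amplitude c_i."""
    return [c * cmath.exp(1j * e * t / HBAR) for c, e in zip(c0, energies)]


def probabilities(c):
    """Energy probabilities c_i c_i* for a list of amplitudes."""
    return [abs(a) ** 2 for a in c]
```

For example, `probabilities(evolve([0.6, 0.8j], [0.5, 1.5], 3.7))` should agree with `probabilities([0.6, 0.8j])` up to rounding, which is the statement that the energy statistics of an unperturbed system do not change in time.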

The manner in which the probabilities $uu^*$ and $ww^*$ are affected by
the time-dependence of c; has been discussed in Section 13. 

The transformation equations are not quite new to us. We used 
them unintentionally in Chapter 1 when we represented the amplitudes 
u(xyzt) of a free particle by

$$u(xyzt) = h^{-3/2} \int a(\xi\eta\zeta)\, e^{-(i/\hbar)(x\xi + y\eta + z\zeta - Et)}\, dp$$ (12)




This expansion arose from the idea of correlating a wave packet with a 
particle so that we might be able to predict the probable results of 
future measurements. Now we can see that equation (12) can be 
looked upon as the connection that exists between the functions 
u(xyzt) and $w(\xi\eta\zeta t)$ according to the transformation theory. For a
particle on which no force is acting, $w(\xi\eta\zeta t)$ signifies the probability
amplitude of a certain energy E, since here E depends only on $\xi$, $\eta$, $\zeta$ as
given by $E = (\xi^2 + \eta^2 + \zeta^2)/2m$. Therefore $w(\xi\eta\zeta t)$ plays the part
of $c_i(t)$ and depends on time by the factor $e^{(i/\hbar)Et}$. Thus the expansion
coefficients $a(\xi\eta\zeta)$ in equation (12), which originally were considered
amplitudes of de Broglie waves, now appear to have the significance
of a statistical ensemble referring to the momentum components.
Let us make it clear also that equation (29) of Section 9 


$$u(xyzt) = \iiint \gamma e^{i\delta}\, a(xyzE\alpha\beta)\, e^{(i/\hbar)[Et - S(xyzE\alpha\beta)]}\, dE\, d\alpha\, d\beta$$


which corresponds to the motion of a particle in a field of force, is in 
agreement with the transformation theory. The function u is here 


expanded in terms of the eigenfunctions $\psi_i$, that is, $u = \sum c_i\psi_i e^{(i/\hbar)E_i t}$.
In the place of $c_i$ in equation (29) we set $\gamma e^{i\delta}\, dE\, d\alpha\, d\beta$. The factor
$\gamma e^{i\delta}$ is the one mentioned in Section 9; it remains arbitrary in the
determination of $\psi_i$. The summation sign is replaced by an integral
sign because Schroedinger's equation, within the approximations of
geometric optics, is solved by a continuous sequence of functions.
Accordingly $\psi_i$ is replaced by the functions $a(xyzE\alpha\beta)\,e^{-(i/\hbar)S(xyzE\alpha\beta)}$, by
means of which the wave equation can be integrated within the considered
approximation; therefore they are to be used as eigenfunctions.

Finally let us consider the expectation values of Section 15. The 
formulas that have been evaluated there are, as can be easily seen, 
consistent with the following rule: 


When a state is described by a function u(xyz), the expectation value
of any observable is found by multiplying u* with the function into
which u is changed by the operator belonging to the observable and
then integrating this product over the whole configuration space.
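The rule lends itself to a direct numerical check in one dimension. The sketch below is only an illustration under stated assumptions: ħ = 1, a hypothetical Gaussian `packet` centered at `x0` with mean momentum `p0`, a simple Riemann sum for the integral, and a central difference standing in for $\partial/\partial x$. The operator for momentum is taken as $-(\hbar/i)(\partial/\partial x)$, as in the text.

```python
import cmath
import math

HBAR = 1.0  # assumption: units in which hbar = 1


def packet(x, x0=0.7, p0=1.2):
    """Hypothetical normalized Gaussian state centered at x0 with mean momentum p0."""
    return (math.pi ** -0.25 * math.exp(-0.5 * (x - x0) ** 2)
            * cmath.exp(-1j * p0 * x / HBAR))


def position_op(f, x):
    """Operator of the coordinate: multiplication by x."""
    return x * f(x)


def momentum_op(f, x, eps=1e-4):
    """Operator -(hbar/i) d/dx, approximated by a central difference."""
    return -(HBAR / 1j) * (f(x + eps) - f(x - eps)) / (2.0 * eps)


def expectation(op, f, a=-10.0, b=10.0, n=4001):
    """Integral of f* (op f) over [a, b], evaluated as a Riemann sum."""
    dx = (b - a) / (n - 1)
    total = 0j
    for k in range(n):
        x = a + k * dx
        total += f(x).conjugate() * op(f, x)
    return total * dx
```

With these choices `expectation(position_op, packet)` should come out near `x0` and `expectation(momentum_op, packet)` near `p0`, both with a negligible imaginary part.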


This rule can be used inversely for determining the operator from the 
formula for expectation value. For example, for the expectation value 


$\overline{p_x^2}$ we obtained the expression $\int u^*(\hbar/i)^2(\partial^2 u/\partial x^2)\,dv$, from which it
follows that the operator of $p_x^2$ must be $(\hbar/i)^2(\partial^2/\partial x^2)$, that is, it is
the reapplication of the operator $-(\hbar/i)(\partial/\partial x)$. The operator associated
with the x component of the angular momentum is, according to
equation (59′), given by $-(\hbar/i)[y(\partial/\partial z) - z(\partial/\partial y)]$. If repeated n
times, this operator represents $d_x^n$. Moreover it follows from
$\bar{x} = \int uu^*x\,dv$ that the operator belonging to the x coordinate consists in
the multiplication by x.

The rule connecting expectation value and operator can be applied 
easily to a state described by the function $w(\xi\eta\zeta)$. Here we must
multiply w* by the function into which w is transformed by the operator
and then integrate over the p space. Then we must correlate the
coordinate x with the operator $(\hbar/i)(\partial/\partial\xi)$, whereas $\xi$ is represented by
multiplication by $\xi$. Accordingly we have

$$\bar{x} = \int w^*\,\frac{\hbar}{i}\frac{\partial w}{\partial\xi}\, dp \qquad \bar{\xi} = \int w^*\xi w\, dp$$ (67)


17. The Linear Oscillator. At this point we wish to apply the 
methods of quantum mechanics to some examples. We shall consider 
first a simple linear oscillator, hence a system with but one degree of 
freedom. Let a particle of mass m move in the x direction subject to 
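a force $V = ax^2/2$ centered at the origin of coordinates. The wave
equation is given by

$$-\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} + \frac{a}{2}x^2\psi = E\psi$$

or

$$\frac{d^2\psi}{dx^2} + \left(E - \frac{a}{2}x^2\right)\frac{2m}{\hbar^2}\psi = 0$$

On dividing through by $\sqrt{ma/\hbar^2}$ and introducing a new variable
$x' = x\sqrt[4]{ma/\hbar^2}$, we obtain

$$\frac{d^2\psi}{dx'^2} + \left(\frac{2E}{\hbar}\sqrt{\frac{m}{a}} - x'^2\right)\psi = 0$$ (68)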


It was shown in Section 11 that only for certain values, $E_0, E_1, \cdots$,
of the parameter E do regular solutions, $\psi_0, \psi_1, \cdots$, of equation (68)
exist. It can be proved that these eigenvalues $E_i$ are given by

$$E_i = \frac{\hbar}{2}\sqrt{\frac{a}{m}}\,(2i + 1)$$ (69)

and correspond to the eigenfunctions

$$\psi_i = a_i e^{-x'^2/2} H_i(x')$$ (70)


the constants $a_i$ being arbitrary and $H_i$ representing certain polynomials,
called Hermitean, which can be defined by

$$H_i(x') = (-1)^i e^{x'^2}\frac{d^i e^{-x'^2}}{dx'^i}$$ (71)

From this we see that

$$H_0 = 1 \qquad H_1 = -e^{x'^2}\frac{d e^{-x'^2}}{dx'} = 2x' \qquad H_2 = e^{x'^2}\frac{d^2 e^{-x'^2}}{dx'^2} = 4x'^2 - 2 \quad\text{etc.}$$


Generally $H_i$ is a polynomial of the ith order, beginning with $(2x')^i$:

$$H_i = (2x')^i - \frac{i(i - 1)}{1!}(2x')^{i-2} + \frac{i(i - 1)(i - 2)(i - 3)}{2!}(2x')^{i-4} - \cdots$$
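The series can be turned into a small routine that lists the coefficients of $H_i$. This is a minimal sketch; the function name `hermite_coefficients` and the dictionary layout (power of x′ mapped to its coefficient) are illustrative assumptions. The k-th term of the series is taken as $(-1)^k\,[i(i-1)\cdots(i-2k+1)/k!]\,(2x')^{i-2k}$.

```python
from math import factorial


def hermite_coefficients(i):
    """Return {power: coefficient} for H_i(x'), built from the series in (2x')."""
    coeffs = {}
    for k in range(i // 2 + 1):
        power = i - 2 * k
        # i(i-1)...(i-2k+1), the falling product that appears in the k-th term
        falling = factorial(i) // factorial(i - 2 * k)
        sign = -1 if k % 2 else 1
        coeffs[power] = sign * (falling // factorial(k)) * 2 ** power
    return coeffs
```

For i = 2 this reproduces $4x'^2 - 2$, in agreement with the value of $H_2$ obtained from the definition (71).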
The proof that equation (68) is solved by (70) follows. From (71), 
for $H_{i+1}$ we have

$$H_{i+1} = (-1)^{i+1} e^{x'^2}\frac{d^{i+1} e^{-x'^2}}{dx'^{i+1}} = -e^{x'^2}\frac{d}{dx'}\left(e^{-x'^2}H_i\right) = 2x'H_i - \frac{dH_i}{dx'}$$ (72)


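On the other hand, $H_{i+1}$ can be written

$$H_{i+1} = (-1)^{i+1} e^{x'^2}\frac{d^i}{dx'^i}\frac{d e^{-x'^2}}{dx'} = (-1)^i\, 2e^{x'^2}\frac{d^i}{dx'^i}\left(x'e^{-x'^2}\right)$$ (73)

Now the ith derivative $(d^i/dx'^i)(x'e^{-x'^2})$ can be transformed step by
step in the following way:

$$\frac{d^i}{dx'^i}\left(x'e^{-x'^2}\right) = \frac{d^{i-1}e^{-x'^2}}{dx'^{i-1}} + \frac{d^{i-1}}{dx'^{i-1}}\left(x'\frac{d e^{-x'^2}}{dx'}\right)$$

$$\frac{d^{i-1}}{dx'^{i-1}}\left(x'\frac{d e^{-x'^2}}{dx'}\right) = \frac{d^{i-1}e^{-x'^2}}{dx'^{i-1}} + \frac{d^{i-2}}{dx'^{i-2}}\left(x'\frac{d^2 e^{-x'^2}}{dx'^2}\right)$$

$$\cdots$$

By continuing this procedure i times and summing up all equations
we get

$$\frac{d^i}{dx'^i}\left(x'e^{-x'^2}\right) = i\,\frac{d^{i-1}e^{-x'^2}}{dx'^{i-1}} + x'\frac{d^i e^{-x'^2}}{dx'^i}$$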

When we introduce this result into (73) we obtain

$$H_{i+1} = (-1)^i\, 2e^{x'^2}\left(i\,\frac{d^{i-1}e^{-x'^2}}{dx'^{i-1}} + x'\frac{d^i e^{-x'^2}}{dx'^i}\right) = -2iH_{i-1} + 2x'H_i$$ (74)


From (72) and (74) it follows that

$$\frac{dH_i}{dx'} = 2iH_{i-1}$$ (75)

On differentiating (72) relative to x′ and substituting $2(i + 1)H_i$ for
$dH_{i+1}/dx'$ according to (75), we have

$$2(i + 1)H_i = -\frac{d^2H_i}{dx'^2} + 2x'\frac{dH_i}{dx'} + 2H_i$$


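or, since $H_i = (1/a_i)\psi_i e^{x'^2/2}$,

$$2(i + 1)\psi_i e^{x'^2/2} = -\left(\frac{d^2\psi_i}{dx'^2} + 2x'\frac{d\psi_i}{dx'} + x'^2\psi_i + \psi_i\right)e^{x'^2/2} + 2x'\left(\frac{d\psi_i}{dx'} + x'\psi_i\right)e^{x'^2/2} + 2\psi_i e^{x'^2/2}$$

Finally we obtain the equation

$$\frac{d^2\psi_i}{dx'^2} + (2i + 1 - x'^2)\psi_i = 0$$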


from which it is seen that (70) is truly a solution of (68) and that the 
corresponding eigenvalues are given by $E_i = (\hbar/2)\sqrt{a/m}\,(2i + 1)$.
The solutions $e^{-x'^2/2}H_i$ of (68) form a system of orthogonal functions,
that is, they satisfy the condition $\int e^{-x'^2}H_iH_k\,dx' = 0$ for $i \neq k$.
This follows immediately from the orthogonality of the eigenfunctions,
as proved in Section 13. In order to fulfill the normalization condition
as well, the $a_i$ terms are chosen in such a way that, for any i,

$$\int \psi_i^2\, dx = a_i^2 \sqrt[4]{\frac{\hbar^2}{ma}} \int e^{-x'^2}H_i^2\, dx' = a_i^2 (-1)^i \sqrt[4]{\frac{\hbar^2}{ma}} \int H_i \frac{d^i e^{-x'^2}}{dx'^i}\, dx' = 1$$

We can transform the integral by integrating by parts i times. This
gives

$$(-1)^i \int \frac{d^i e^{-x'^2}}{dx'^i} H_i\, dx' = 2^i i! \int e^{-x'^2}\, dx' = 2^i i!\,\sqrt{\pi}$$


The solutions are normalized if for the $a_i$ terms we choose the values

$$a_i = \frac{1}{\sqrt{2^i i!\,\sqrt{\pi}}}\sqrt[8]{\frac{ma}{\hbar^2}}$$
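The orthogonality condition and the value $2^i i!\sqrt{\pi}$ of the normalization integral can be checked numerically in the variable x′. The sketch below is a hedged illustration: it evaluates $H_i$ by the recurrence $H_{i+1} = 2x'H_i - 2iH_{i-1}$ of (74) and approximates the integral by a plain Riemann sum on an assumed interval; the names `hermite` and `overlap` are not from the text.

```python
import math


def hermite(i, x):
    """H_i(x) from the recurrence H_{j+1} = 2x H_j - 2j H_{j-1}, with H_0 = 1."""
    h_prev, h = 0.0, 1.0
    for j in range(i):
        h_prev, h = h, 2.0 * x * h - 2.0 * j * h_prev
    return h


def overlap(i, k, half_width=10.0, n=4001):
    """Riemann-sum estimate of the integral of exp(-x^2) H_i(x) H_k(x) dx."""
    dx = 2.0 * half_width / (n - 1)
    total = 0.0
    for s in range(n):
        x = -half_width + s * dx
        total += math.exp(-x * x) * hermite(i, x) * hermite(k, x)
    return total * dx
```

Under these assumptions `overlap(3, 3)` should be close to $2^3\,3!\,\sqrt{\pi}$ and `overlap(2, 4)` close to zero.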


We are especially interested in the energies $E_i$, which, according to
(69), are given by

$$E_i = \frac{2i + 1}{2}\,\hbar\sqrt{\frac{a}{m}}$$

As $\sqrt{a/m}/2\pi$ is identical with the classical characteristic frequency $\nu_0$
of the oscillator, we may write $E_i = (2i + 1)(h\nu_0/2)$, where $h = 2\pi\hbar$. It is remarkable
that wave mechanics furnishes odd multiples of $h\nu_0/2$ for the energy
levels of the oscillator, whereas, according to the original quantum
theory, E should be even multiples of $h\nu_0/2$. This difference is irrelevant
for the evaluation of the spectra, since the frequencies of the lines
depend only on the differences of the energy levels. It indicates, however,
that the oscillator should have a zero-point energy of $h\nu_0/2$ in
its lowest state. Indeed, there is some evidence that such a zero-point
energy exists. The important point, however, is that the theory
provides the correct distances of the energy levels and is able to achieve 
this without the help of any supplementary assumptions. In wave 
mechanics the selection of a discrete sequence of states is an effect 
of the regularity conditions imposed on the solutions of the wave 
equation. Thus the selection occurs just as naturally as in the case 
of a string which, because of boundary conditions, is capable of vibra- 
tions with an integral number of nodes only. In short, we do not have 
to resort to unintelligible postulates to attain the quantum states 
since the quantum conditions already are implied in the plan of the 
new mechanics. 
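The level scheme discussed above is easy to tabulate. The following sketch evaluates $E_i = (2i + 1)h\nu_0/2$ in SI units; the frequency used in the usage note is an arbitrary illustrative value, and the function name is an assumption.

```python
import math

H_PLANCK = 6.62607015e-34  # Planck's constant h in J s (exact SI value)


def oscillator_energy(i, nu0):
    """Energy of the i-th oscillator level, (2i + 1) h nu0 / 2, in joules."""
    return (2 * i + 1) * H_PLANCK * nu0 / 2.0
```

With, say, `nu0 = 1.0e13` (a hypothetical frequency in Hz), neighbouring levels differ by exactly $h\nu_0$, while the lowest level keeps the zero-point energy $h\nu_0/2$.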

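According to equation (70), the eigenfunctions are given by
$\psi_i = a_i e^{-x'^2/2}H_i(x')$. If now we substitute for x′ the value given in the
equation

$$x' = x\sqrt[4]{\frac{4\pi^2 ma}{h^2}} = x\sqrt{\frac{4\pi^2 m\nu_0}{h}}$$

we obtain

$$\psi_i = a_i\, e^{-(2\pi^2 m\nu_0/h)x^2}\, H_i\!\left(x\sqrt{\frac{4\pi^2 m\nu_0}{h}}\right)$$
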
The first five eigenfunctions are plotted in Fig. 2. It will be observed
that outside the considered domain, from x = −3 to x = +3, the
functions tend asymptotically to zero very rapidly. It should be
noticed that every eigenfunction shows several nodal points the number
of which is the same as the i number of the eigenfunction. This
follows at once from the fact that the polynomials are of the ith
degree. Consequently, when the oscillator is in the ith stationary
state, there are i places in which the electron is never found.
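The statement about nodal points can be checked by counting sign changes of $H_i$ on a grid. This is a minimal sketch: the grid limits and spacing are assumptions chosen wide and fine enough for the first few polynomials, and `hermite` again uses the recurrence $H_{i+1} = 2x'H_i - 2iH_{i-1}$.

```python
def hermite(i, x):
    """H_i(x) from the recurrence H_{j+1} = 2x H_j - 2j H_{j-1}, with H_0 = 1."""
    h_prev, h = 0.0, 1.0
    for j in range(i):
        h_prev, h = h, 2.0 * x * h - 2.0 * j * h_prev
    return h


def count_nodes(i, half_width=6.0, n=2401):
    """Count the sign changes of H_i on an evenly spaced grid over [-half_width, half_width]."""
    dx = 2.0 * half_width / (n - 1)
    signs = []
    for s in range(n):
        value = hermite(i, -half_width + s * dx)
        if value != 0.0:          # skip exact zeros so each node counts once
            signs.append(value > 0.0)
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)
```

Since the factor $e^{-x'^2/2}$ never vanishes, the nodes of $\psi_i$ are those of $H_i$, and the count should equal i for each of the first few states.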

The functions $\psi_i$ are partly negative, and, although this fact does
not affect the probabilities $\psi_i\psi_i^*$, it is important for states represented
by $u = \sum c_i\psi_i e^{(i/\hbar)E_i t}$.

The distinction between the description of the oscillation according 
to wave mechanics and that according to classical mechanics should 
be understood clearly. A stationary state no longer has the character 
of motion in the sense of kinematics. We can only specify the probability
that a particle will be found at any point on the x axis when a
measurement of position is made. However, a broad indication of
the classical oscillation is present, because, according to classical
mechanics, for every $\psi_i$ there is a definite relation between the position
of the maximum and the amplitude of the oscillation. For example,
the maximum of $\psi_1$ lies at x′ = 1.1, and therefore $x = 1.1\sqrt{h/4\pi^2 m\nu_0}$.
In classical mechanics, however, the amplitude A, which is associated
with the energy $3h\nu_0/2$, is assigned a value $aA^2/2 = 3h\nu_0/2$ and therefore
$A = \sqrt{3}\sqrt{h/4\pi^2 m\nu_0}$. In general, it turns out that every eigenfunction
gives a probability that differs from zero only within a domain
that approximately covers the range of the classical oscillations.

Fig. 2.
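How closely the probability is confined to the classical range can be estimated numerically. In the variable x′ the classical amplitude belonging to $E_i$ is $\sqrt{2i + 1}$ (for i = 1 this is the factor $\sqrt{3}$ quoted above). The sketch below, with illustrative function names and a midpoint rule as assumed quadrature, integrates the normalized density $\psi_i^2$ between the classical turning points.

```python
import math


def hermite(i, x):
    """H_i(x) from the recurrence H_{j+1} = 2x H_j - 2j H_{j-1}, with H_0 = 1."""
    h_prev, h = 0.0, 1.0
    for j in range(i):
        h_prev, h = h, 2.0 * x * h - 2.0 * j * h_prev
    return h


def normalized_density(i, x):
    """psi_i^2 in the variable x', normalized with the factor 2^i i! sqrt(pi)."""
    norm = 2.0 ** i * math.factorial(i) * math.sqrt(math.pi)
    return math.exp(-x * x) * hermite(i, x) ** 2 / norm


def probability_inside_classical_range(i, n=4000):
    """Midpoint-rule integral of psi_i^2 over |x'| <= sqrt(2i + 1)."""
    turning = math.sqrt(2 * i + 1)
    dx = 2.0 * turning / n
    return sum(normalized_density(i, -turning + (s + 0.5) * dx) for s in range(n)) * dx
```

For the ground state the result should agree with the error function value erf(1), so that most, though not all, of the probability lies inside the classical range.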

Thus the stationary states of the oscillator contain only a few basic 
properties that may be taken as left-overs from the old pendulum 
oscillation. Nevertheless, according to Section 14, transition to this 
conception must be accomplished by considering states that unite a 
complex of eigenvibrations of high order, namely, those vibrations in 


which the terms of $\sum c_i\psi_i e^{(i/\hbar)E_i t}$ are associated with large values of i.
It must be possible to superpose the vibrations so that $uu^*$ differs from
zero only within a small domain. As we know, such a packet holds 
together for some time, moving through space like a point mass of 
classical mechanics. Schroedinger at one time worked this out in 
detail by a calculation which provides a good illustration of the transi- 
tion from micro to macro mechanics. 

18. The Hydrogen Atom. The theorem is subjected to its most 
important test in the application to the H atom. It is known that the 
energy terms of this atom could be determined correctly by the old 
quantum theory. The question here is whether wave mechanics 
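leads to the same results. So that the cases of He⁺ and Li⁺⁺ may
be included, let us assume that the charge on the nucleus is Ze. Disregarding
the motion of the nucleus, we get for the wave equation

$$\frac{\hbar^2}{2m}\nabla^2\psi + \left(E + \frac{Ze^2}{r}\right)\psi = 0$$ (76)

On changing from rectangular to spherical coordinates wherein

$$x = r\sin\theta\cos\phi \qquad y = r\sin\theta\sin\phi \qquad z = r\cos\theta$$ (77)

we obtain

$$\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial\psi}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial\psi}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2\psi}{\partial\phi^2} + \frac{2m}{\hbar^2}\left(E + \frac{Ze^2}{r}\right)\psi = 0$$ (78)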


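To solve this equation we consider $\psi$ the product of a function R(r) and
$Y(\theta\phi)$. Then equation (78) becomes

$$Y\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) + \frac{R}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial Y}{\partial\theta}\right) + \frac{R}{\sin^2\theta}\frac{\partial^2 Y}{\partial\phi^2} + \frac{2m}{\hbar^2}(Er^2 + Ze^2r)RY = 0$$

When this equation is divided by RY, the left-hand side consists of
two parts one of which depends only on r and the other on $\theta$ and $\phi$.
The sum of these parts can be zero only if each part has a constant
value. If we denote this constant by k, we have

$$\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) + \frac{2m}{\hbar^2}(Er^2 + Ze^2r)R = kR$$ (79)

$$\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial Y}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2 Y}{\partial\phi^2} + kY = 0$$ (80)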


Each of the above equations is of the type that can be solved by regular
functions for certain values of k only. Let us consider first the second
of the two equations and show that its solutions are given by spherical
harmonics. These functions can be defined in the following way:
Assume that u is a homogeneous polynomial of the lth order in xyz
which satisfies the equation $\nabla^2 u = 0$. When we replace the rectangular
by spherical coordinates, u transforms into $r^lY_l(\theta\phi)$, where $Y_l$ is a
function that depends only on $\theta$ and $\phi$ and therefore represents a
function on the surface of a sphere of radius unity. These functions
are called spherical harmonics of the lth order. When we set $u = r^lY_l$,
the equation $\nabla^2 u = 0$ becomes

$$l(l + 1)Y_l + \frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial Y_l}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2 Y_l}{\partial\phi^2} = 0$$ (81)

as should be obvious if in (78) $r^lY_l$ replaces $\psi$ and the terms not due to
the operator $\nabla^2$ are eliminated. A comparison of (81) with (80) shows
that (80) can be solved by the functions $Y_l$ if for k we substitute the
values l(l + 1), where l = 0, 1, 2, ⋯. This means that we have
determined the eigenvalues and eigenfunctions of (80). No stress is
placed here on the proof that other solutions do not exist.

However, it is important to prove that there exist 2/ + 1 different 
harmonics of the lth order which are linearly independent. Linear 


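independence means that, for a system of functions $f_1, f_2, \cdots$, the
equation $c_1f_1 + c_2f_2 + \cdots = 0$ can be fulfilled only if
$c_1 = c_2 = c_3 = \cdots = 0$. To prove the theorem recall the equation $\nabla^2 u = 0$
by which we have defined the functions $Y_l$. Let us take those of its
solutions which are homogeneous in xyz and of the lth degree. In
order to determine the solutions which, divided by $r^l$, will yield $Y_l$,
it is convenient to introduce new variables $\xi$, $\eta$, z for xyz, defined by

$$\xi = x + iy \qquad \eta = x - iy \qquad z = z$$ (82)

so that

$$\frac{\partial}{\partial x} = \frac{\partial}{\partial\xi} + \frac{\partial}{\partial\eta} \qquad \frac{\partial^2}{\partial x^2} = \frac{\partial^2}{\partial\xi^2} + 2\frac{\partial^2}{\partial\xi\,\partial\eta} + \frac{\partial^2}{\partial\eta^2}$$

$$\frac{\partial}{\partial y} = i\left(\frac{\partial}{\partial\xi} - \frac{\partial}{\partial\eta}\right) \qquad \frac{\partial^2}{\partial y^2} = -\frac{\partial^2}{\partial\xi^2} + 2\frac{\partial^2}{\partial\xi\,\partial\eta} - \frac{\partial^2}{\partial\eta^2}$$

and thus $\nabla^2 u = 0$ can be transformed into

$$4\frac{\partial^2 u}{\partial\xi\,\partial\eta} + \frac{\partial^2 u}{\partial z^2} = 0$$ (83)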


This equation can be solved if for u we choose the special form of the
homogeneous polynomial of the lth order defined by

$$u = \xi^m z^n\left(1 + a_1\xi\eta z^{-2} + a_2\xi^2\eta^2 z^{-4} + \cdots\right)$$ (84)

In this equation m + n = l, and the exponent of $\xi$ exceeds that of $\eta$ by m, where m is an
arbitrary whole number between 0 and l. When we introduce (84)
into (83) we obtain

$$4a_1(m + 1)\xi^m z^{n-2} + 4a_2(m + 2)2\,\xi^{m+1}\eta z^{n-4} + \cdots + n(n - 1)\xi^m z^{n-2} + a_1(n - 2)(n - 3)\xi^{m+1}\eta z^{n-4} + \cdots = 0$$


wherein again the exponents of $\xi$ and $\eta$ in the terms containing these
differ by m, every term occurring twice. If, then, in order to satisfy
the equation we set the factor of every product of powers of $\xi$, $\eta$, z equal to zero,
we obtain a recurrence formula by means of which the coefficients are
determined uniquely. In this way we see that for the equation
$\nabla^2 u = 0$ there exist l + 1 different homogeneous solutions which are
linearly independent and of such kind that in every term the exponent of
$\xi$ exceeds that of $\eta$ by a whole number $m \geq 0$.
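The recurrence can be made explicit and tested mechanically. Setting the factor of $\xi^{m+k}\eta^k z^{n-2k-2}$ to zero gives $4a_{k+1}(m + k + 1)(k + 1) + a_k(n - 2k)(n - 2k - 1) = 0$ with $a_0 = 1$. The sketch below, whose function names and dictionary representation of polynomials are illustrative assumptions, builds the polynomial (84) from this relation and applies the operator of (83) term by term.

```python
from fractions import Fraction


def coefficients(m, n):
    """a_0, a_1, ... of u = xi^m z^n (1 + a_1 xi eta z^-2 + ...), with a_0 = 1."""
    a = [Fraction(1)]
    k = 0
    while n - 2 * k - 2 >= 0:
        a.append(-a[k] * (n - 2 * k) * (n - 2 * k - 1) / (4 * (m + k + 1) * (k + 1)))
        k += 1
    return a


def polynomial(m, n):
    """Represent u as {(power of xi, power of eta, power of z): coefficient}."""
    return {(m + k, k, n - 2 * k): c for k, c in enumerate(coefficients(m, n))}


def transformed_laplacian(poly):
    """Apply 4 d^2/(d xi d eta) + d^2/dz^2, the operator of (83), term by term."""
    out = {}
    for (p, q, s), c in poly.items():
        if p >= 1 and q >= 1:
            key = (p - 1, q - 1, s)
            out[key] = out.get(key, 0) + 4 * c * p * q
        if s >= 2:
            key = (p, q, s - 2)
            out[key] = out.get(key, 0) + c * s * (s - 1)
    return {key: c for key, c in out.items() if c != 0}
```

For l = 3 and m = 1 this yields $u = \xi z^2 - \tfrac{1}{4}\xi^2\eta$, and `transformed_laplacian` returns an empty dictionary, as it should for every admissible pair m, n.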
On the other hand, (83) can be solved as well by polynomials of the 
form 
$$\eta^m z^n\left(1 + a_1\xi\eta z^{-2} + a_2\xi^2\eta^2 z^{-4} + \cdots\right)$$ (85)

wherein m is again a positive whole number $\leq l$ but zero is omitted this
time because the corresponding solution has been considered already.
Thus it is true, as we have maintained, that there are 2l + 1 linearly
independent solutions of the lth degree in $\xi\eta z$, and hence in xyz also.
When divided by $r^l$ they transform into as many spherical harmonics
$Y_l^m$, the index m signifying the difference between the exponents of $\xi$
and $\eta$; it can be any number in the sequence $-l, -l + 1, \cdots, l - 1, l$.
Reverting to spherical coordinates by means of the relations

$$\xi = x + iy = re^{i\phi}\sin\theta \qquad \eta = x - iy = re^{-i\phi}\sin\theta \qquad z = r\cos\theta$$

then according to (84) for values of $m \geq 0$ we obtain

$$Y_l^m = e^{im\phi}\sin^m\theta\cos^{l-m}\theta\left(1 + a_1\sin^2\theta\cos^{-2}\theta + \cdots\right)$$ (86)

and for values of m < 0, according to (85), we get

$$Y_l^m = e^{im\phi}\sin^{-m}\theta\cos^{l+m}\theta\left(1 + a_1\sin^2\theta\cos^{-2}\theta + \cdots\right)$$ (86′)

The factors of $e^{im\phi}\sin^m\theta$ and $e^{im\phi}\sin^{-m}\theta$ sometimes are called the
correlated spherical harmonics and may be designated by $P_l^m$.
Two spherical harmonics, $Y_l$ and $Y_k$, of different orders are orthogonal.
This means that the integral of $Y_lY_k^*$ over the surface of a
unit sphere vanishes. To prove this we apply Green's theorem

$$\int (u\nabla^2 v - v\nabla^2 u)\, dv = \int \left(u\frac{\partial v}{\partial n} - v\frac{\partial u}{\partial n}\right) df$$

to the unit sphere, wherein u and v represent the functions $Y_lr^l$
and $Y_k^*r^k$ respectively. The left-hand side vanishes because $Y_lr^l$
and $Y_k^*r^k$ are solutions of the equation $\nabla^2 u = 0$, but the right-hand
side yields $\int Y_lY_k^*(k - l)r^{l+k-1}\, df$. Thus $\int Y_lY_k^*\, df = 0$ for $l \neq k$.
The orthogonality also holds for two different harmonics $Y_l^m$ and
$Y_l^{m'}$ of the same order, since

$$\int Y_l^m Y_l^{m'*}\, df = \int_0^{2\pi} e^{i(m - m')\phi}\, d\phi \int d\theta \cdots = 0 \qquad (\text{for } m \neq m')$$

After this digression we return to the wave mechanics equations, 
(79) and (80), of the H atom. As we have seen, the second equation 


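can be solved only if k = l(l + 1) where l = 0, 1, 2, ⋯, and for
every one of the eigenvalues we obtain not one but several eigenfunctions
Y as defined by $Y_l^m$ (m = −l ⋯ +l). Thus one part
Y of $\psi = RY$ may be considered determined. The other part must
be found by making use of equation (79) which, when k = l(l + 1),
becomes

$$\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) + \frac{2m}{\hbar^2}\left(E + \frac{Ze^2}{r} - \frac{\hbar^2 l(l + 1)}{2mr^2}\right)R = 0$$ (87)

When we set

$$\frac{2m}{\hbar^2}E = A \qquad \frac{2m}{\hbar^2}Ze^2 = B \qquad l(l + 1) = C$$

and perform the operation

$$\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) = \frac{d^2R}{dr^2} + \frac{2}{r}\frac{dR}{dr} = \frac{1}{r}\frac{d^2(rR)}{dr^2}$$

(87) takes the form

$$\frac{d^2(rR)}{dr^2} + \left(A + \frac{B}{r} - \frac{C}{r^2}\right)rR = 0$$ (88)

When we introduce a new variable v by setting $rR = ve^{-\alpha r}$, the equation
becomes

$$\frac{d^2v}{dr^2} - 2\alpha\frac{dv}{dr} + \left(A + \alpha^2 + \frac{B}{r} - \frac{C}{r^2}\right)v = 0$$ (89)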


If we choose $\alpha = \sqrt{-A}$, with the assumption that in the following E
is negative so that $\alpha$ is real, then $A + \alpha^2$ in equation (89) vanishes.
It can be shown that a differential equation of the form (89) can be
solved for certain values by the polynomial

$$v = \sum_{i=0}^{n} a_i r^{\gamma + i}$$ (90)

n being a finite number. To determine $\gamma$ and $a_i$ we introduce (90)
into (89) and obtain

$$\sum a_i(\gamma + i)(\gamma + i - 1)r^{\gamma+i-2} - 2\alpha\sum a_i(\gamma + i)r^{\gamma+i-1} + \sum a_i\left(Br^{\gamma+i-1} - Cr^{\gamma+i-2}\right) = 0$$


In order to satisfy this equation the factor of every power of r must be 
set equal to zero. As the lowest power is $\gamma - 2$, the associated factor
being $a_0\gamma(\gamma - 1) - Ca_0$, and as C = l(l + 1), we obtain $\gamma = l + 1$
or $\gamma = -l$. The second value is to be excluded because it would
correspond to a function v that is not regular at the point r = 0. By
setting the other factors equal to zero we arrive at a recurrence formula
for $a_i$ wherein the first coefficient $a_0$ remains arbitrary. In order to
have the series a finite one the $a_i$ terms must vanish from a certain
i = n upwards, that is, $a_{n+1} = a_{n+2} = \cdots = 0$. To achieve this we
determine the coefficient of $r^{\gamma+n-1}$ for which we have

$$a_{n+1}(\gamma + n + 1)(\gamma + n) - 2\alpha a_n(\gamma + n) + a_nB - a_{n+1}C$$


And, as $a_{n+1}$ is to be zero, it is necessary that $2\alpha(\gamma + n) = B$. Hence

$$\alpha = \sqrt{-A} = \frac{B}{2(\gamma + n)} = \frac{mZe^2}{\hbar^2(n + l + 1)}$$

And, since $A = (2m/\hbar^2)E$,

$$E = -\frac{mZ^2e^4}{2\hbar^2(n + l + 1)^2}$$ (91)

Thus, on the assumption of a negative E, the wave equation for R can be
solved by a regular function only for those values of E that are identical
with the terms of Balmer. We do not consider here the proof that for
E < 0 other solutions do not exist.

It is important to realize, however, that the preceding method has a
meaning only for a negative E, because with a positive E the polynomial
cannot be limited to a finite number of terms. Then $\alpha$ becomes
imaginary, so that the equality $2\alpha(n + \gamma) = B$ no longer applies
because B by definition is real. And so, when the energy is positive,


the solution of (89) does not depend on certain values of E, there being
a solution for any E. Thus the spectrum is no longer discrete but
becomes continuous.
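For the discrete part of the spectrum, formula (91) can be evaluated directly. The sketch below works in Gaussian (CGS) units, in which (91) is written, and converts the result to electron volts; the constant values are rounded reference values inserted here as assumptions, and the function name is illustrative.

```python
M_E = 9.1093837e-28         # electron mass in g (assumed rounded value)
E_CHARGE = 4.80320471e-10   # elementary charge in esu (assumed rounded value)
HBAR = 1.054571817e-27      # hbar in erg s (assumed rounded value)
ERG_PER_EV = 1.602176634e-12


def energy_ev(n, l, z=1):
    """E = -m Z^2 e^4 / [2 hbar^2 (n + l + 1)^2] of (91), returned in eV."""
    e_erg = -M_E * z ** 2 * E_CHARGE ** 4 / (2.0 * HBAR ** 2 * (n + l + 1) ** 2)
    return e_erg / ERG_PER_EV
```

For hydrogen the lowest level, n = l = 0, should come out near −13.6 eV, and the same routine with `z=2` gives levels four times as deep, which corresponds to the case of He⁺ mentioned at the beginning of the section.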

We shall, however, confine ourselves to the more important case of 
the discontinuous spectrum. Here the eigenfunctions and eigenvalues 
of the wave equation are given by 


$$\psi = R_{nl}Y_l^m \qquad E_{nl} = -\frac{me^4Z^2}{2\hbar^2(n + l + 1)^2}$$ (92)

$R_{nl}$ being defined by

$$R_{nl} = \frac{1}{r}e^{-\alpha_{nl}r}v_{nl}$$ (93)

where $v_{nl}$ designates the polynomials $\sum_{i=0}^{n} a_ir^{\gamma+i}$. We see that for the H
atom a single number no longer suffices to characterize a given stationary
state. We need three numbers, n, l, m. The energy of the atom
depends only on the sum of n and l, being independent of m. And so
there exist different stationary states of the same energy, a peculiarity
which is called degeneracy. Every energy level possesses a certain
degree of degeneracy, the degree being defined by the number of the
states that belong to the level diminished by 1. For the level
n + l + 1 = n′, the degree of degeneracy is $n'^2 - 1$, since for l all whole numbers
between 0 and n′ − 1 are admissible and for every one of these
l's the number m can assume 2l + 1 different values. Accordingly
the number of different states is $1 + 3 + \cdots + (2n' - 1) = n'^2$,
all of which have the same energy, $-mZ^2e^4/[2\hbar^2(n + l + 1)^2]$.
This means that the level with the principal quantum number n′ is
degenerated $n'^2 - 1$ times.
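The counting argument can be reproduced by listing the states explicitly. In this minimal sketch the function name `states` is an illustrative assumption; it enumerates all triples (n, l, m) that satisfy n + l + 1 = n′ with m running from −l to +l.

```python
def states(n_prime):
    """All triples (n, l, m) with n + l + 1 = n_prime and -l <= m <= l."""
    result = []
    for l in range(n_prime):
        n = n_prime - l - 1
        for m in range(-l, l + 1):
            result.append((n, l, m))
    return result
```

The length of `states(n_prime)` should equal $n'^2$, so that the degree of degeneracy in the sense used above is `len(states(n_prime)) - 1`.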

This concept of degeneration had been known in the old quantum 
theory. That theory was able to give to this concept a figurative 
interpretation, because in it the different states of the same energy 
could be distinguished by the elements of the rotational motion such 
as the eccentricity of the ellipses or the inclination of the orbit plane to 
the z axis. Wave mechanics cannot adopt this interpretation because 
the notion of a rotational motion no longer has a meaning. Therefore 
another criterion is required which will permit a distinction between 
states of the same energy and which can be found in the expectation 
value of the angular momentum. 

By applying the principles of the transformation theory let us examine
how a state described in q space by $\psi = R_{nl}Y_l^m$ and therefore
specified by three integers n, l, m is to be specified relative to the z
component of the angular momentum. According to Section 16,
the operator of this component is $-(\hbar/i)[x(\partial/\partial y) - y(\partial/\partial x)]$, the
meaning being that, if $f_1, f_2, \cdots$ are the eigenfunctions of the
operator, that is, solutions of the equation

$$-\frac{\hbar}{i}\left(x\frac{\partial f}{\partial y} - y\frac{\partial f}{\partial x}\right) = af$$

and $a_1, a_2, \cdots$ are the corresponding eigenvalues, a measurement of the
component can furnish a value $a_i$ only and we can obtain the probability
of a certain $a_i$ by expanding $\psi$ in terms of $f_i$. The coefficient of the
ith eigenfunction in that expansion is the amplitude for the result
$d_z = a_i$.

The equation for the f functions can be transformed by using the 
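coordinates r, $\theta$, $\phi$ instead of x, y, z; thus

$$\frac{\partial f}{\partial\phi} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial\phi} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial\phi} + \frac{\partial f}{\partial z}\frac{\partial z}{\partial\phi}$$

But

$$x = r\sin\theta\cos\phi \qquad y = r\sin\theta\sin\phi \qquad z = r\cos\theta$$

Therefore

$$\frac{\partial x}{\partial\phi} = -y \qquad \frac{\partial y}{\partial\phi} = x \qquad \frac{\partial z}{\partial\phi} = 0$$

and

$$\frac{\partial f}{\partial\phi} = x\frac{\partial f}{\partial y} - y\frac{\partial f}{\partial x}$$ (94)

Therefore the equation for the f functions reduces to

$$-\frac{\hbar}{i}\frac{\partial f}{\partial\phi} = af$$

with the solutions $f = e^{-(i/\hbar)a\phi}$, a being an arbitrary number. In
order to find the probability that a measurement carried out on an
atom in the state $\psi = R_{nl}Y_l^m$ gives the value a for the component $d_z$ of
angular momentum, we have to expand $R_{nl}Y_l^m$ in terms of $e^{-(i/\hbar)a\phi}$.
But by (86) the function $R_{nl}Y_l^m$ as it depends on $\phi$ has the form
$e^{im\phi}f(r\theta)$. Therefore the expansion reduces to the single term with
$a = -\hbar m$. And so the measurement of $d_z$ carried out on a state n, l,
m leads with certainty to the result $d_z = -\hbar m$.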

Let us examine further the square of the angular momentum 
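$d^2 = d_x^2 + d_y^2 + d_z^2$. In Section 16 we saw that the operator of
$p_x^2$ consists of the repeated operation of $-(\hbar/i)(\partial/\partial x)$. Consequently
we have to assume that the operator of $d^2$ is

$$-\hbar^2\left[\left(y\frac{\partial}{\partial z} - z\frac{\partial}{\partial y}\right)^2 + \left(z\frac{\partial}{\partial x} - x\frac{\partial}{\partial z}\right)^2 + \left(x\frac{\partial}{\partial y} - y\frac{\partial}{\partial x}\right)^2\right]$$

Thus, if we wish to know how the atom is to be described with respect
to $d^2$, we must find the eigenfunctions of this operator and carry out
the corresponding expansion of $\psi$. Again it turns out that
$\psi = R_{nl}Y_l^m$ is already an eigenfunction, as it is converted by the operator
into a multiple of itself. In order to show this we differentiate $\psi$
relative to $\theta$, holding r and $\phi$ constant. Thus

$$\frac{\partial\psi}{\partial\theta} = \frac{\partial\psi}{\partial x}\frac{\partial x}{\partial\theta} + \frac{\partial\psi}{\partial y}\frac{\partial y}{\partial\theta} + \frac{\partial\psi}{\partial z}\frac{\partial z}{\partial\theta} = r\cos\theta\cos\phi\frac{\partial\psi}{\partial x} + r\cos\theta\sin\phi\frac{\partial\psi}{\partial y} - r\sin\theta\frac{\partial\psi}{\partial z}$$

Multiplying the last term by $\sin^2\phi + \cos^2\phi$, we have

$$\frac{\partial\psi}{\partial\theta} = \left(z\frac{\partial\psi}{\partial x} - x\frac{\partial\psi}{\partial z}\right)\cos\phi - \left(y\frac{\partial\psi}{\partial z} - z\frac{\partial\psi}{\partial y}\right)\sin\phi$$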
Defining the components of the vector product by 
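$$L_x = y\frac{\partial}{\partial z} - z\frac{\partial}{\partial y} \qquad L_y = z\frac{\partial}{\partial x} - x\frac{\partial}{\partial z} \qquad L_z = x\frac{\partial}{\partial y} - y\frac{\partial}{\partial x}$$

and considering these components operators rather than factors, we
have [cf. (94)]

$$\frac{\partial}{\partial\theta} = \cos\phi\, L_y - \sin\phi\, L_x \qquad \frac{\partial}{\partial\phi} = L_z$$ (95)

Furthermore it is readily seen that

$$xL_x + yL_y + zL_z = 0$$ (96)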


It must be pointed out that, in all products containing L, the order of 
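the factors is important. For example, $L\sin\phi$ is not the same as
$\sin\phi\, L$. In the former, L also operates on $\sin\phi$, but in the latter $\sin\phi$
remains unchanged.

Because of (95), equation (96) may be written in the form

$$\cos\phi\, L_x + \sin\phi\, L_y = -\frac{\cos\theta}{\sin\theta}\frac{\partial}{\partial\phi}$$ (97)

From (95) and (97) we obtain the values for $L_x$ and $L_y$:

$$L_x = -\sin\phi\frac{\partial}{\partial\theta} - \cos\phi\frac{\cos\theta}{\sin\theta}\frac{\partial}{\partial\phi} \qquad L_y = \cos\phi\frac{\partial}{\partial\theta} - \sin\phi\frac{\cos\theta}{\sin\theta}\frac{\partial}{\partial\phi}$$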


70 WAVE MECHANICS OF STATIONARY STATES [Cu. 2 


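A repeated operation of $L_x$ gives the operator

$$L_x^2 = \left(\sin\varphi\frac{\partial}{\partial\theta} + \cos\varphi\cot\theta\frac{\partial}{\partial\varphi}\right)\left(\sin\varphi\frac{\partial}{\partial\theta} + \cos\varphi\cot\theta\frac{\partial}{\partial\varphi}\right)$$

$$= \sin^2\varphi\frac{\partial^2}{\partial\theta^2} + \sin\varphi\cos\varphi\,\frac{\partial\cot\theta}{\partial\theta}\frac{\partial}{\partial\varphi} + \sin\varphi\cos\varphi\cot\theta\frac{\partial^2}{\partial\theta\,\partial\varphi} + \cos^2\varphi\cot\theta\frac{\partial}{\partial\theta} + \cos\varphi\sin\varphi\cot\theta\frac{\partial^2}{\partial\varphi\,\partial\theta} - \sin\varphi\cos\varphi\cot^2\theta\frac{\partial}{\partial\varphi} + \cos^2\varphi\cot^2\theta\frac{\partial^2}{\partial\varphi^2}$$

A corresponding expression holds for $L_y^2$, so that we obtain for $d^2$

$$-\hbar^2(L_x^2 + L_y^2 + L_z^2) = -\hbar^2\left(\frac{\partial^2}{\partial\theta^2} + \cot\theta\frac{\partial}{\partial\theta} + \cot^2\theta\frac{\partial^2}{\partial\varphi^2} + \frac{\partial^2}{\partial\varphi^2}\right) = -\hbar^2\left[\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\varphi^2}\right]$$

When this operator acts upon the function $\psi = R_{nl}Y_l^m$, the result is
$\hbar^2 l(l+1)\psi$, since $Y_l^m$ as a spherical harmonic satisfies the equation

$$\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial Y_l^m}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2 Y_l^m}{\partial\varphi^2} + l(l+1)Y_l^m = 0$$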


Thus $\psi$ is an eigenfunction of $d^2$ with the eigenvalue $\hbar^2 l(l+1)$.
Therefore, when a measurement of angular momentum is made on
an atom in the state $n$, $l$, $m$, the measurement being possible at least
mentally, we find with certainty that $d = \hbar\sqrt{l(l+1)}$.
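
The eigenvalue statement can be checked numerically. The following is a minimal sketch, assuming the unnormalized spherical harmonic $\sin\theta\cos\theta\,e^{i\varphi}$ (that is, $l = 2$, $m = 1$) and simple central differences; the function names and the step size are illustrative choices and do not come from the text. It applies the angular operator in the bracket above and returns the ratio that should equal $l(l+1) = 6$.

```python
import cmath
import math


def Y21(theta, phi):
    """Unnormalized spherical harmonic with l = 2, m = 1 (an assumed test function)."""
    return math.sin(theta) * math.cos(theta) * cmath.exp(1j * phi)


def angular_operator(Y, theta, phi, h=1e-4):
    """Apply d2/dtheta2 + cot(theta) d/dtheta + (1/sin^2 theta) d2/dphi2 by central differences."""
    d_theta = (Y(theta + h, phi) - Y(theta - h, phi)) / (2 * h)
    d2_theta = (Y(theta + h, phi) - 2 * Y(theta, phi) + Y(theta - h, phi)) / h**2
    d2_phi = (Y(theta, phi + h) - 2 * Y(theta, phi) + Y(theta, phi - h)) / h**2
    return d2_theta + d_theta / math.tan(theta) + d2_phi / math.sin(theta) ** 2


def eigenvalue_ratio(theta, phi):
    """Return -(operator applied to Y)/Y; for l = 2 this should be close to l(l+1) = 6."""
    return -angular_operator(Y21, theta, phi) / Y21(theta, phi)
```

The ratio is independent of the point chosen, provided the point avoids the zeros of the test function at $\theta = 0$ and $\theta = \pi/2$.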

19. Discussion of the Solution. Comparison with Bohr’s
Theory. As we have noticed already, the situation is more complicated
for the H atom than for the oscillator. No longer do the eigenfunctions
form a linear sequence to which the numbers 1, 2, 3, $\cdots$
can be coordinated. They correspond to a three-dimensional manifold
of numbers $n$, $l$, and $m$. The reason for this evidently is that the
system now has three degrees of freedom whereas the oscillator had
but one. This means that a measurement of energy alone no longer
suffices for the unique determination of state, for according to (91)
the energy fixes the sum $n + l$ only; in addition we have to measure
the amount of angular momentum $d$ and the component $d_z$ of the
vector d in some direction. These measurements inform us about $l$
and $m$, and so the state can be described by $\psi_{nlm} = R_{nl}Y_l^m$.

The old quantum theory characterized the states of the H atom by
three whole numbers, $n$, $k_1$, $k_2$. The numbers originated in the
postulate that the phase integrals $\oint p\,dq$ which belong to the three
degrees of freedom should be given for a stationary state by whole
multiples of $h$. This idea led to the equation

$$E = -\frac{mZe^4}{2\hbar^2(n + k_1 + k_2)^2} \qquad (98)$$

for energy. Thus the former theory also considered the problem of
degeneracy, since all states with the same sum, $n + k_1 + k_2$, agreed
in energy. As regards the quantum numbers $k_1$ and $k_2$, they were
related to the angular momentum in such a way that $k_1 + k_2 = k$
determined the amount of momentum $d = \hbar k$, whereas $k_1$ belonged to
the component $d_z$, the value of which was $d_z = \hbar k_1$. As can be seen,
there is a close correspondence between the former quantum numbers
$k$, $k_1$ and the wave mechanics numbers $l$, $m$. At the same time we must
not overlook the fact that the significance of the whole numbers which
now appear as eigenvalues from a differential equation is quite different
in wave mechanics from that of the former theory. We need only
compare the denominators in the energy equations (98) and (92) to
see this. In (98) the $n'$ of a given energy $E = -mZe^4/2\hbar^2n'^2$ is the
sum $n + k$; thus $n' + 1$ different $k$ values are allowed wherein
$k = 0, 1, 2, \cdots, n'$. The former theory therefore was bound to assume
the existence of $n' + 1$ terms of $E$, whereas experiment conforms only
to $n'$ terms. In order to have the theory agree with the facts, the
value $k = 0$ was excluded on the ground that $k = 0$ would correspond
to a motion in which the angular momentum is zero. The motion
then would be a linear pendular oscillation, the consequence being a
collision between nucleus and electron. At first this assumption
seemed to be plausible, but it led to insurmountable difficulties later.
It was here that wave mechanics immediately gave evidence of its
advantages, for in this theory $n'$ in the denominator of the energy
equation represents the sum $n + l + 1$, and for a given $n'$ the
values of $l$ range from $l = 0$ to $l = n' - 1$, this being in agreement
with the experimental fact that the number of terms becomes $n'$.
Accordingly wave mechanics, in substituting $l + 1$ for $k$, no longer
needs to resort to doubtful assumptions. The assumption of a pendulum
orbit at any rate would be of no use here, since in wave mechanics
the concept of “orbit of an electron” is meaningless.

There is still another divergence between the two theories, one
closely connected with the interpretation of the number $n'$. As we
have remarked before, Bohr’s circular and elliptic orbits appear in
wave mechanics in the form of electronic clouds. The entirety of the
possible positions of the electron no longer forms a distinct orbit but
rather a statistical cloud in which the former orbit is recognizable only
as an indistinct condensation. If the orbits were diffuse only, we
might expect that the semiclassical and the wave-mechanical models of
the atom would have some features in common, for example, angular
momentum. Contrary to this expectation, the wave-mechanical states,
irrespective of their diffuse nature, prove to be something quite different
from the corresponding states of the Bohr theory. Let us consider
the ground state. It was defined, by $n = 0$, as a circular motion of
the electron. As such it was assumed to have a certain angular
momentum $d$ together with a corresponding magnetic moment $\mu$,
the relation between these two quantities being expressed by
$\mu = ed/2m$. Since, for $k = 1$, $d = \hbar$, the magnetic moment should be
just a magneton $\hbar e/2m$, so that the atoms in a magnetic field would
be expected to orient themselves parallel or antiparallel to the direction
of the field, the consequence being the deflection of the atoms from their
rectilinear motion by a sufficiently inhomogeneous field. Indeed this
conclusion seemed to be strikingly confirmed by the famous experiment
of Stern and Gerlach.

Wave mechanics, however, takes a rather different view. In the
ground state of the atom $l = 0$, and consequently $d = d_z = 0$. Hence,
for the ground state and in general for any state with $l = 0$ (all S
states), wave mechanics denies the existence of a magnetic moment due
to the rotation of the electron. This viewpoint follows from the fact
that all states for which $l = 0$ are spherically symmetric, since they
are represented by $\psi = R_{n0}Y_0$ or $\psi = R_{n0}$, the spherical harmonic of
the 0th order being a constant. Thus, according to wave mechanics, for an
atom with $l = 0$ there is no plane that could be taken as an indication
of a plane in which the motion of the electron takes place in such a
way that the electron would be found there with prevalent probability;
all planes through the nucleus are equivalent. This means that, apart
from the spatial extension of the atom which we have yet to discuss,
any similarity with the old atomic conception has vanished.

But now the question arises as to how the Stern-Gerlach experiment
can lead to a positive result if there is no magnetic moment in an S
state. This experiment leaves no doubt that an Ag atom in the ground
state reacts like a magnet. To explain this we must resort to the
hypothesis proposed by Uhlenbeck and Goudsmit according to which
the electron possesses an angular momentum $d = \hbar/2$ of its own, this
mechanical moment producing a magnetic moment $e\hbar/2m$. As is
well known, this assumption of electron spin has proved to be of
extraordinary usefulness. In Chapter 6 we shall see that the spin of an
electron represents a relativity effect and is forthcoming from that
theory as soon as the wave equation is formulated in a relativistically
invariant way. But, as long as we confine ourselves to a nonrelativistic
treatment of quantum mechanics, we cannot cope with the spin
in any other way than by postulating it on the basis of a hypothesis.
The explanation of the Stern-Gerlach experiment then is that the
electron belonging to an atom with $l = 0$ sets its magnetic axis parallel
or antiparallel to the direction of the field acting on it, so that the
moving atom must be deflected by an inhomogeneous field.

In contrast to the S states with $l = 0$, the P, D, $\cdots$ states with
$l = 1, 2, \cdots$ are axially symmetric with respect to the arbitrarily
chosen z axis, for here the eigenfunctions are dependent on $\theta$ and $\varphi$
through the factor $Y_l^m = e^{im\varphi}f(\theta)$ but in such a way that in $\psi\psi^*$ the
angle $\varphi$ does not occur. And so the probability of finding the electron
at a given position is a function of $r$ and $\theta$ only. Because of its
axial symmetry, the atom now has an angular momentum with
$d = \hbar\sqrt{l(l+1)}$ and $d_z = \hbar m$. The angle between the vector d and the
z axis is $\theta$, the cosine of which is given by $d_z/d = m/\sqrt{l(l+1)}$.
Since, for a given $l$, the magnetic quantum number may assume any
of the $2l + 1$ integer values between $-l$ and $+l$, there exist $2l + 1$
different orientations of the vector d. This fact was known in the
older theory, but now we can state the theorem without using the
concept of an electron orbit.
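
As an illustration of the $2l + 1$ settings, the short sketch below tabulates the angle between d and the z axis from $\cos\theta = m/\sqrt{l(l+1)}$. The function name is hypothetical, and the sketch assumes $l \geq 1$, since for $l = 0$ the vector d vanishes and no angle is defined.

```python
import math


def orientations(l):
    """Return (m, angle in degrees) for cos(theta) = m / sqrt(l(l+1)), m = -l, ..., +l.

    Assumes l >= 1; for l = 0 the angular momentum is zero and no angle exists.
    """
    d = math.sqrt(l * (l + 1))  # magnitude of d in units of hbar
    return [(m, math.degrees(math.acos(m / d))) for m in range(-l, l + 1)]


if __name__ == "__main__":
    for m, angle in orientations(2):
        print(f"m = {m:+d}: angle = {angle:.2f} degrees")
```

Because $|m| \leq l < \sqrt{l(l+1)}$, the vector d never lies exactly along the z axis, even for the extreme values $m = \pm l$.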

Another question still remains to be answered. The z axis having
been chosen quite arbitrarily, how can it be explained that there are
only certain “settings” of the vector d relative to this axis? Evidently
this statement can have sense only if the z axis has some physical
significance, since the atom itself cannot satisfy the required condition
for any direction. The answer is that, without directly stating it,
we have distinguished the z axis by assuming that component $d_z$ was
measured in that direction. We can carry out the measurement only
by having a magnetic field act on the atom in the direction of the z
axis and examining the effect produced by the field. Thus the measurement
constitutes an interference owing to which the atom is forced
to assume a certain orientation with respect to the direction in which
the atom is acted upon, and the assertion of wave mechanics is that
the settings then correspond to a component $d_z$ equal to an integer
multiple of $\hbar$. Therefore it is not illogical to state that the component
$d_z$ of the angular momentum is capable only of discrete values $m\hbar$
for any direction z. This does not mean, of course, that according
to quantum mechanics there exists a vector the projection of which in
any direction gives an integer multiple of $\hbar$ but that the measurement
of $d_z$ furnishes such a result. The possibility of measuring simultaneously
the components $d_z$ and $d_{z'}$ corresponding to two directions
$z$ and $z'$ is denied by quantum mechanics. When we know the value
of $d_z$ we lose this knowledge by measuring $d_{z'}$, since the interference
produced by the second measurement destroys the orientation effected
by the first one.


Fig. 3.


Let us consider further the radial part $R$ of the eigenfunctions.
According to (93) $R_{nl}$ is given by the equation

$$R_{nl} = e^{-me^2r/[\hbar^2(n+l+1)]}\,v_{nl}$$

where $v_{nl}$ is a polynomial of the $(n+l)$th order in $r$. In the old theory
$\hbar^2/me^2$ is just the radius $a_1$ of the first electron orbit; consequently
$R_{nl}$ also may be written in the form

$$R_{nl} = e^{-r/[a_1(n+l+1)]}\,v_{nl}$$

From the exponent of $e$ it is to be expected that with increasing $r$
the function $R_{nl}$ will decrease toward zero, the more slowly the greater
the value of $n + l$. This is in agreement with the former assumption
that the extension of the orbits increases with $n + k$. In Fig. 3 the
$R_{nl}$ functions are plotted for different values of $n$ and $l$. Figure 4
shows $R_{nl}^2r^2$ as it depends on $r$, wherein the abscissas are measured
in units of $a_1$. The figure makes it clear that the domain within which
$\psi\psi^*$ is noticeably different from zero has approximately the extension
of the old quantum orbits. For instance, for the ground state
$n = l = 0$, the maximum of $R_{nl}^2r^2$ lies at $r = a_1$; for the state $n = 0$, $l = 1$
it lies at $r = 4a_1$, etc. Thus the distances of the maxima from the
nucleus are identical with the radii of the orbits in Bohr’s theory.
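
The positions of these maxima are easy to confirm numerically. The sketch below is a minimal check under stated assumptions: it treats only the states with $n = 0$, for which the polynomial is taken as $v_{0l} = r^l$ up to a constant factor (so that $v_{nl}/r^l$ has degree zero), it measures $r$ in units of $a_1$, and it uses a plain grid search. The function names and grid parameters are illustrative.

```python
import math


def radial_density(r, l):
    """Unnormalized R^2 r^2 for n = 0, with R = r^l exp(-r/(l+1)) and r in units of a_1."""
    R = r**l * math.exp(-r / (l + 1))
    return (R * r) ** 2


def peak_position(l, r_max=40.0, steps=40000):
    """Locate the maximum of R^2 r^2 on a uniform grid of spacing r_max/steps."""
    grid = [r_max * i / steps for i in range(1, steps + 1)]
    return max(grid, key=lambda r: radial_density(r, l))


if __name__ == "__main__":
    for l in range(3):
        print(f"n = 0, l = {l}: maximum near r = {peak_position(l):.3f} a_1")
```

Under these assumptions the maxima fall at $r = (l+1)^2a_1$, that is, at $a_1$, $4a_1$, $9a_1$, the radii of the circular Bohr orbits.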


Fig. 4.


It is seen that the functions $R_{nl}$, except those with $n = 0$, vanish for
certain values of $r$. Accordingly there exist nodal surfaces on which it
is certain that the electron is never to be found. It can be shown that
the number of such surfaces is given by the degree of the polynomial
$v_{nl}/r^l$.

It must be emphasized, however, that the given theory of the H 
atom is not yet quite exact. It is based on a wave equation which is 
not invariant with respect to a Lorentz transformation and thus does 
not satisfy the requirements of relativity. As a consequence we are 
not able to account for the fine structure of the spectrum. In Chapter 
6 we shall see how the theory can be made relativistic. We only point 
out here that the improved wave equation furnishes energy terms in 
the expression of which the numbers $n$ and $l$ no longer appear in the
sum $n + l$ but in such a way that any combination $n$, $l$ of the same
sum provides another term. Thus relativity cancels part of the degen- 
eracy and in this way causes a resolution of the spectral lines into a 
certain number of components. 




20. Wave Mechanics and the Correspondence Principle. 
Transition Probabilities. We now return to the general investiga- 
tion of systems consisting of a single particle. In two examples we 
have seen that wave mechanics leads to the correct energy terms; 
therefore Bohr’s frequency condition, $h\nu = E_i - E_k$, enables us to
evaluate the frequencies of the spectral lines emitted by an atom.
But, besides frequency, a line possesses intensity as well, and the
question arises whether the theory can inform us about this also.
This problem of intensity was beyond the competence of the old theory.
The intensity of a certain line, $\nu = (E_i - E_k)/h$, depends on the probability
of the transition $i \to k$, that is, on the average number of atoms
which undergo the transition in unit time. Bohr’s theory was unable 
to determine this number, and in particular it could not explain why 
certain combinations are never observed in the spectra, thus signifying 
that the corresponding transition probabilities are zero. In order to 
contend with this situation of complete helplessness, Bohr proposed his 
famous correspondence principle, which gives a procedure for evaluat- 
ing the intensities, and which, though unintelligible, was convincing 
by its success. By this principle one starts with the radiation that
would be emitted by an electron moving in the $i$th orbit according to
Maxwell’s theory, its frequency and intensity being uniquely determined.
From this one infers the radiation corresponding to transitions
from the $i$th stationary state. This can be done in the following
way. If $\omega$ denotes the classical frequency belonging to the motion in
the $i$th orbit, the electron’s motion can be represented, on making use
of Fourier’s theorem, by

$$x = \sum_n x_n\cos(2\pi n\omega t + \alpha_n) \qquad (99)$$

since $x$, as well as $y$ and $z$, is a periodic function of time and any such
function can be understood as a superposition of harmonic oscillations
of frequencies $\omega, 2\omega, \cdots$ and of amplitudes $x_n$, which can be evaluated
from the function by a well-known method. According to Maxwell’s
theory, every partial oscillator emits an electromagnetic wave of
the same frequency and an energy which, for unit time, is given by

$$\frac{16\pi^4e^2}{3c^3}\,\nu^4(x_n^2 + y_n^2 + z_n^2) \qquad (100)$$


where $x_n$, $y_n$, $z_n$ represent the amplitudes associated with $\nu = n\omega$ in
the expansions of $x$, $y$, and $z$ respectively. Bohr’s correspondence
principle sets down the following requirement: In equation (100) one
must substitute for $\nu$ the quantum-mechanical $\nu_{i,i-n}$, and take this
expression as an approximate measure of the intensity with which light is
emitted by the spontaneous transition from the $i$th into the $(i-n)$th
stationary state. It is assumed further that the transition $i \to (i-n)$
does not occur at all if in the Fourier expansion the amplitudes $x_n$,
$y_n$, $z_n$ of the corresponding partial oscillations are all zero.

Although this rule was only a makeshift one, it proved to be very 
helpful. It is true that the evaluated intensities agreed only roughly 
with the observed ones, but from the exclusion of certain lines an 
exact selection rule could be derived which was of extreme value to 
the former theory. Nevertheless, to be dependent on a rule that could 
not be understood produced an uneasy feeling, and consequently it 
became a matter of interest to see whether wave mechanics would be 
able to uncover the exact law hidden behind the principle. 

On the basis of the theory developed up to this point, a satisfactory 
solution of the problem is not yet possible. All our considerations 
have been concerned exclusively with systems that were supposed to 
be closed and thus subject to the principle of the conservation of 
energy. It is precisely this supposition which we have to drop when we 
attempt to learn something about the transition probabilities, for 
then we are dealing with processes in which the atom interacts with 
the surrounding radiation field by exchanging energy with it. Accord- 
ingly we shall not be able to attack the problem of transition proba- 
bilities until we learn how to handle composed systems. Neverthe- 
less we may, at this stage, try to discover a plausible method which 
will aid us in making an exact formulation of the correspondence 
principle. 

We have seen that any state of an atom can be described by the form
$u = \sum_i c_i\psi_ie^{(i/\hbar)E_it}$. If, in accordance with the procedure of the
correspondence principle, we wish to represent the coordinate $x$ as a
function of time, this can be done in quantum mechanics only by
considering the expectation value of $x$ in its dependence on $t$, for of
itself the coordinate $x$ for which the measurement may provide any
value has nothing to do with time. Now for $\bar{x}$ we have

$$\bar{x} = \int x\,uu^*\,dv = \int x\sum_i c_i\psi_ie^{(i/\hbar)E_it}\sum_k c_k^*\psi_k^*e^{-(i/\hbar)E_kt}\,dv = \sum_{ik}c_ic_k^*\,e^{(i/\hbar)(E_i - E_k)t}\int x\,\psi_i\psi_k^*\,dv$$

which may be written in the simpler form

$$\bar{x} = \sum_{ik}c_ic_k^*\,x_{ki}\,e^{(i/\hbar)(E_i - E_k)t} \qquad (101)$$

on letting

$$\int x\,\psi_i\psi_k^*\,dv = x_{ki} \qquad (102)$$

The $x_{ki}$ terms are quantities which depend on the nature of the system
and which, because of their definition, satisfy the condition

$$x_{ik} = x_{ki}^* \qquad (103)$$

Therefore, taking together the $x_{ik}$ and $x_{ki}$ terms, we obtain

$$\bar{x} = \sum_{ik}|c_ic_k^*|\,2|x_{ik}|\cos(2\pi\nu_{ik}t + \alpha_{ik}) \qquad (104)$$


Thus according to wave mechanics the expectation value of $x$ can be
expanded, in terms of harmonic oscillations, into a series which, however,
does not contain whole multiples of a fundamental frequency $\omega$
as did (99), but refers to the frequencies emitted by the atom according
to Bohr’s condition. Thus, as far as frequencies are concerned, we
now meet with a strict agreement instead of a correspondence, and this
makes it probable that there is an agreement in the amplitudes also,
so that the $|x_{ik}|$ may be assumed to represent the exact amplitudes and
are to be substituted for $x_n$, $y_n$, $z_n$. As for the factors $|c_ic_k^*|$,
they evidently must not be considered as having a meaning for the
intensity law because, in contrast to the experimental facts, their
effect would be that, for an atom for which every $c_k$ except a certain
$c_i$ equals zero, that is, an atom in a certain stationary state, a spontaneous
transition to other states would be impossible. Hence we
come to the conclusion that the correspondence principle is to be
formulated as follows: When an atom is in the $i$th stationary state, the
probability of a spontaneous transition into a state $k < i$ is given by

$$\frac{64\pi^4e^2}{3hc^3}\,\nu_{ik}^3\left(|x_{ki}|^2 + |y_{ki}|^2 + |z_{ki}|^2\right) \qquad (105)$$

As the transition is connected with the emission of energy ($h\nu_{ik} =
E_i - E_k$), the intensity $J$ of the radiation emitted by a great number
$N$ of atoms all in the same $i$th stationary state must be

$$J = N\,\frac{64\pi^4e^2}{3c^3}\,\nu_{ik}^4\left(|x_{ki}|^2 + |y_{ki}|^2 + |z_{ki}|^2\right) \qquad (106)$$

The $x_{ki}$ terms in this equation must be evaluated, by making use of
(102), from the normalized eigenfunctions, since otherwise they would
not be single-valued. Indeed, we shall see in Chapter 8 that the law
(106) can be given an exact quantum-mechanical foundation.




Of special interest is the case where the $x_{ki}$ terms vanish for certain
values of $i$ and $k$. If $y_{ki}$ and $z_{ki}$ are also zero, then according
to (105) a transition $i$ to $k$ cannot occur and a line of frequency $\nu_{ik}$ is
not observed. But, if $y_{ki}$ and $z_{ki}$ equal zero with $x_{ki}$ having a value
different from zero, we may assume that there is emitted a radiation
analogous to that in Maxwell’s theory, where the radiation is due to an
electron moving in the x direction. In other words, for this case we
may expect the emission of light which is linearly polarized in the x
direction. Finally, if only $z_{ki}$ is zero, we can expect elliptically or
circularly polarized light. Indeed the partial oscillations

$$|x_{ki}|\cos(2\pi\nu_{ik}t + \alpha) \quad\text{and}\quad |y_{ki}|\cos(2\pi\nu_{ik}t + \beta)$$

are then contained in $\bar{x}$ and $\bar{y}$, which together represent an elliptic or
circular oscillation.

The application of these “selection and polarization rules” to the
oscillator and the H atom leads to complete agreement with the facts.
For the oscillator the theory requires that the only permissible transitions
are those between two neighboring energy levels in which the
light can be emitted or absorbed only in quanta $h\nu_0$. With

$$\psi_i = a_ie^{-x'^2/2}H_i(x') \qquad x' = 2\pi\sqrt{\frac{m\nu_0}{h}}\,x$$

we have

$$x_{ki} = \frac{h}{4\pi^2m\nu_0}\,a_ia_k\int_{-\infty}^{+\infty}e^{-x'^2}H_i(x')H_k(x')\,x'\,dx' \qquad (107)$$

When we integrate by parts, the integral becomes

$$\left[-\tfrac{1}{2}e^{-x'^2}H_iH_k\right]_{-\infty}^{+\infty} + \frac{1}{2}\int_{-\infty}^{+\infty}e^{-x'^2}\left(H_k\frac{dH_i}{dx'} + H_i\frac{dH_k}{dx'}\right)dx'$$

We had for the function $H_i$

$$\frac{dH_i}{dx'} = 2iH_{i-1}$$

Thus (107) can be written

$$x_{ki} = \frac{h}{4\pi^2m\nu_0}\,a_ia_k\int_{-\infty}^{+\infty}e^{-x'^2}(iH_{i-1}H_k + kH_{k-1}H_i)\,dx' \qquad (108)$$

Because the orthogonality condition holds for the functions
$e^{-x'^2/2}H_i(x')$, the integral vanishes except for the values $k = i - 1$
and $i = k - 1$. Thus the selection rule permits only the transitions
$i$ to $i - 1$ and $i$ to $i + 1$. For the first, (108) gives

$$x_{i-1,i} = \frac{h}{4\pi^2m\nu_0}\,i\,a_ia_{i-1}\int_{-\infty}^{+\infty}e^{-x'^2}H_{i-1}^2\,dx' = \frac{1}{2\pi}\sqrt{\frac{h}{m\nu_0}}\;i\,\frac{a_i}{a_{i-1}}\int_{-\infty}^{+\infty}\psi_{i-1}^2\,dx$$

wherein for $a_i$ the value

$$a_i = \frac{\text{constant}}{\sqrt{2^i\,i!}}$$

is used. Because of the normalization factor the integral has the value
unity. Therefore

$$x_{i-1,i} = \frac{1}{2\pi}\sqrt{\frac{ih}{2m\nu_0}} \qquad (109)$$

In a corresponding manner we find that

$$x_{i+1,i} = \frac{1}{2\pi}\sqrt{\frac{(i+1)h}{2m\nu_0}} \qquad (109)'$$
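
The selection rule and the values (109) and (109)' can be confirmed numerically. The sketch below is an illustration under stated assumptions, not the method of the text: it works in the dimensionless coordinate $x'$, uses the normalization $1/\sqrt{\sqrt{\pi}\,2^n n!}$ for the oscillator functions, builds the Hermite polynomials from the standard recurrence $H_{k+1} = 2x'H_k - 2kH_{k-1}$, and integrates with the trapezoidal rule. All function names and the integration limits are hypothetical choices.

```python
import math


def hermite(n, x):
    """Hermite polynomial H_n(x) from the recurrence H_{k+1} = 2x H_k - 2k H_{k-1}."""
    h_prev, h = 0.0, 1.0
    for k in range(n):
        h_prev, h = h, 2 * x * h - 2 * k * h_prev
    return h


def phi(n, x):
    """Oscillator function normalized with respect to the dimensionless coordinate x'."""
    norm = math.sqrt(math.sqrt(math.pi) * 2**n * math.factorial(n))
    return math.exp(-x * x / 2) * hermite(n, x) / norm


def x_element(k, i, half_width=12.0, steps=24000):
    """Trapezoidal estimate of the integral of phi_k(x') * x' * phi_i(x') over x'."""
    dx = 2 * half_width / steps
    total = 0.0
    for j in range(steps + 1):
        x = -half_width + j * dx
        weight = 0.5 if j in (0, steps) else 1.0
        total += weight * phi(k, x) * x * phi(i, x)
    return total * dx


if __name__ == "__main__":
    i = 3
    for k in range(6):
        print(f"k = {k}, i = {i}: {x_element(k, i):+.6f}")
```

In these units the nonvanishing elements should come out as $\sqrt{i/2}$ for $k = i - 1$ and $\sqrt{(i+1)/2}$ for $k = i + 1$; multiplying by $(1/2\pi)\sqrt{h/m\nu_0}$ restores the dimensional form of (109) and (109)'.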


In order to apply the selection rule to the H atom we must prove
first the theorem that any homogeneous polynomial of the $l$th order
can be expressed in terms of spherical harmonics $Y$. To do this we
shall determine the number of terms in the polynomial. There is one
term with $x^l$, two with $x^{l-1}$ since $x^{l-1}$ can unite with either $y$ or $z$, three
terms with $x^{l-2}$, and so on, to $l + 1$ with $x^0$. Hence an arbitrary
polynomial of the $l$th order contains $1 + 2 + 3 + \cdots + (l + 1)$
different products and as many constants. Now exactly the same
number of constants is contained in the sum

$$r^l(Y_l + Y_{l-2} + \cdots) \qquad (110)$$

which also represents a homogeneous polynomial of the $l$th order
because, as we have seen, $Y_l$ can be composed of $2l + 1$ spherical
harmonics $Y_l^m$. Therefore the number of linearly independent harmonics
in (110) is

$$(2l+1) + (2l-3) + \cdots = (l+1+l) + (l-1+l-2) + \cdots = (l+1) + l + (l-1) + (l-2) + \cdots$$

By suitably choosing the factors of the $Y$, it must therefore be possible
to make (110) agree with the given polynomial.
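
The counting argument can be verified by direct enumeration. The following small sketch, with hypothetical function names, counts the monomials $x^ay^bz^c$ of total degree $l$ and compares the result with the number of harmonics $Y_j^m$ for $j = l, l-2, \cdots$ appearing in (110).

```python
def monomial_count(l):
    """Number of monomials x^a y^b z^c with a + b + c = l (c is fixed once a and b are chosen)."""
    return sum(1 for a in range(l + 1) for b in range(l + 1 - a))


def harmonic_count(l):
    """Number of harmonics Y_j^m with j = l, l-2, ..., each order j contributing 2j + 1."""
    return sum(2 * j + 1 for j in range(l, -1, -2))


if __name__ == "__main__":
    for l in range(6):
        print(l, monomial_count(l), harmonic_count(l))
```

Both counts equal $(l+1)(l+2)/2$, which is the sum $1 + 2 + \cdots + (l+1)$ used in the text.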


Using this theorem, we can show now that the H atom is capable
only of transitions in which $l$ is changed by $+1$ or $-1$. For this purpose
consider the transition from a state $n$, $l$, $m$ to $n'$, $l'$, $m'$ wherein $l' < l$.
Then $x$ is defined by

$$x_{nlm,n'l'm'} = \int x\,\psi_{nlm}^*\psi_{n'l'm'}\,dv = \int r^2R_{nl}R_{n'l'}\,dr\int x\,Y_l^{m*}Y_{l'}^{m'}\,d\omega \qquad (111)$$

where $d\omega$ signifies a surface element of the unit sphere. When we
set $xY_{l'}^{m'} = r^{-l'}\,x\,r^{l'}Y_{l'}^{m'}$, $x\,r^{l'}Y_{l'}^{m'}$ is a homogeneous polynomial of the
order $l' + 1$ and therefore it can be written in the form
$r^{l'+1}(Y_{l'+1} + Y_{l'-1} + \cdots)$.

We have then

$$x_{nlm,n'l'm'} = \int r^3R_{nl}R_{n'l'}\,dr\int Y_l^{m*}(Y_{l'+1} + Y_{l'-1} + \cdots)\,d\omega$$

According to Section 18 the integral of $Y_i^*Y_k$ differs from zero only
for $i = k$. On the condition that $l' < l$, the integral (111) can, therefore,
differ from zero only for $l = l' + 1$. If $l' > l$, we have to expand
$xY_l^{m*}$ and the condition turns out to be $l' = l + 1$. Thus the only
transitions possible are those in which $l$ is changed by $\pm 1$.
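
A numerical illustration of this rule is given below. It is a sketch under simplifying assumptions: instead of $x$ it uses the $z$ component with $m = m' = 0$, for which the angular integral reduces to $\int_{-1}^{+1}P_l(u)\,u\,P_{l'}(u)\,du$ with $u = \cos\theta$ and $P_l$ the Legendre polynomial (unnormalized). The function names are hypothetical, and Simpson's rule is an illustrative choice of quadrature.

```python
def legendre(l, u):
    """Legendre polynomial P_l(u) from (k+1) P_{k+1} = (2k+1) u P_k - k P_{k-1}."""
    p_prev, p = 0.0, 1.0
    for k in range(l):
        p_prev, p = p, ((2 * k + 1) * u * p - k * p_prev) / (k + 1)
    return p


def z_angular_integral(l, l_prime, steps=2000):
    """Simpson estimate of the integral of P_l(u) * u * P_l'(u) over -1 <= u <= 1 (steps even)."""
    h = 2.0 / steps
    total = 0.0
    for j in range(steps + 1):
        u = -1.0 + j * h
        weight = 1 if j in (0, steps) else (4 if j % 2 else 2)
        total += weight * legendre(l, u) * u * legendre(l_prime, u)
    return total * h / 3


if __name__ == "__main__":
    for l_prime in range(5):
        print(f"l = 1, l' = {l_prime}: {z_angular_integral(1, l_prime):+.6f}")
```

Only the combinations with $l' = l \pm 1$ give a value different from zero, in line with the rule derived above.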

This rule is of decisive importance for the interpretation of the spectral
series, as it explains the fact that in the sequence of the S, P, D,
$\cdots$ terms, which terms are defined by $l = 0, 1, 2, \cdots$, two
nonconsecutive terms never combine.

As regards the magnetic quantum number $m$, we first evaluate

$$(x+iy)_{nlm,n'l'm'} = \int r^2R_{nl}R_{n'l'}\,dr\int(x+iy)\,Y_l^{m*}Y_{l'}^{m'}\,d\omega = \int R_{nl}R_{n'l'}\,r^3\,dr\int e^{i(m'-m+1)\varphi}\,d\varphi\int\cdots\,d\theta$$

The integral relative to $\varphi$ differs from zero only if the power of $e$ is
zero, that is, if $m' = m - 1$. Similarly we find that $(x - iy)_{nlm,n'l'm'}$
differs from zero only if $m' = m + 1$. Finally $z_{nlm,n'l'm'}$ can be put
into the form

$$z_{nlm,n'l'm'} = \int r^3R_{nl}R_{n'l'}\,dr\int e^{i(m'-m)\varphi}\,d\varphi\int\cdots\,d\theta$$

from which we see that only for $m' = m$ does this quantity not vanish.
Hence the following possibilities exist:

(i) $m \to m \pm 1$: In this case

$$(x+iy)_{ki} = 0 \quad\text{or}\quad (x-iy)_{ki} = 0, \qquad z_{ki} = 0$$

so that

$$x_{ki} = \pm iy_{ki}, \qquad y_{ki} = x_{ki}e^{\mp i\pi/2}, \qquad z_{ki} = 0$$

This means circular polarization of the emitted light.

(ii) $m \to m$: In this case $x_{ki} = y_{ki} = 0$ and $z_{ki}$ is alone different
from zero. The light is linearly polarized in a direction parallel to
the z axis.


It is true that these rules cannot be put to the test on the spectrum 
of the unperturbed H atom, for according to Section 18 the terms then 
are independent of m and thus the spectrum gives us no information 
about the change of $m$. But the degeneracy in $m$ is cancelled when a
magnetic field in the z direction acts on the atom. Then the $2l + 1$
different $m$ values give rise to as many different terms, and the selection
rule comes into play. In fact, we can then see that an originally
simple line when observed in the z direction is resolved into several
circularly polarized lines which can be correlated to the transitions
$m \to m - 1$ and $m \to m + 1$.


PROBLEMS 


1. Find the Schroedinger equation for a rotator, that is, a rigid body rotating
about a fixed axis. Show that the eigenvalues of the energy are given by
$E_n = n^2(\hbar^2/2J)$, where $n$ signifies successive whole numbers 1, 2, $\cdots$ and $J$ is the
moment of inertia, and that the eigenfunctions are given by $\psi_n = ae^{\pm in\varphi}$. Discuss
the spectrum emitted by the rotator.

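2. Show that for a rotator the axis of which is not fixed in space the wave equation
is given by

$$\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial\psi}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2\psi}{\partial\varphi^2} + \frac{2JE}{\hbar^2}\psi = 0$$

Find the eigenfunctions and eigenvalues of this equation. Compare with the
equation for the H atom.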

3. How is it possible that the electron of an H atom in a stationary state can
be found at any distance from the nucleus? Why is not this a contradiction of the
equation $E = E_{\mathrm{kin}} + E_{\mathrm{pot}}$?

4. Show that dp/dt = K.

5. Show that dd/dt = M.

6. If the motion of the nucleus is taken into account, how is the Kepler problem
to be solved? $\psi$ then depends both on the coordinates $x_1y_1z_1$ of the electron and
on the coordinates $x_2y_2z_2$ of the nucleus. Accordingly the wave equation becomes

$$\frac{\hbar^2}{2m_1}\nabla_1^2\psi + \frac{\hbar^2}{2m_2}\nabla_2^2\psi + (E - V)\psi = 0$$

Instead of $x_2y_2z_2$ use the variables $\xi = x_1 - x_2$, $\eta = y_1 - y_2$, $\zeta = z_1 - z_2$ and put
$\psi = \varphi(x_1y_1z_1)\chi(\xi\eta\zeta)$. As $V$ depends only on $\xi\eta\zeta$, the wave equation can be
separated into two equations for $x_1y_1z_1$ and $\xi\eta\zeta$ respectively.


3 
WAVE MECHANICS 
IN MATRIX FORM 


21. The Idea of Matrix Mechanics. Prior to Schroedinger, the 
new quantum mechanics had been developed by Heisenberg, along 
with Born and Jordan, in the form of the so-called matrix mechanics. 
It was Heisenberg’s idea to replace Bohr’s theory by a plan into which 
no entities are introduced except those physically observable. In 
consistently following out this idea he was led to a theory the form of 
which was entirely different from wave mechanics but the contents of 
which later proved to be identical with it. 

In order to explain the nature of matrix mechanics let us recall the 
transformation theory developed in Section 16. We consider an 
arbitrary physical system under conditions that guarantee a pure 
case, that is, the following requirement must be fulfilled: when we 
take a great number of systems, all subject to the same conditions, 
and carry out measurements of any entity such as energy or angular 
momentum, the result will be a statistical array which, for the pure 
case, has to be the same for any sufficiently large part of the assemblage.
Suppose now that we asserted that a measurement of energy
with a probability $|c_0|^2$ furnishes a value $E_0$, and that with a probability
$|c_1|^2$ furnishes a value $E_1$, and so on. We then can characterize the
given state of the system by the numbers $c_0, c_1, \cdots$, disregarding for
the present the fact that the numbers $c_i$ are not uniquely determined by
the probabilities. The transformation theory maintains that, for a
system with one degree of freedom, these numbers $c_i$ define the state
not only with respect to the energy but with respect to all other physical
entities as well. More accurately, it is assumed that from the $c_i$
terms we can infer the probabilities that correspond to the measurements
of any other entity. If we suppose that the other entities as
well as energy have only discrete values, with the probabilities $|c_i'|^2$,
$|c_i''|^2$, $\cdots$, it should be possible to evaluate the $c_i'$, $c_i''$, $\cdots$ terms
from the $c_i$.

It seems desirable to formulate this idea in the following way. In a
space with an infinite number of dimensions let us correlate a vector
with the given state of the system and interpret the probability
amplitudes $c_0, c_1, c_2, \cdots$ as the components of that vector referred to
a certain coordinate system $K$. Similarly we interpret the numbers
$c', c'', \cdots$ as the components of the same vector relative to other
coordinate systems $K', K'', \cdots$. The problem of quantum mechanics
is this: we have to determine the coordinate systems for all entities
and fix their positions relative to an arbitrarily chosen reference
system $K_0$. Then it is sufficient to know the components of the vector
which represents the state relative to one of the systems in order to
determine the probability amplitudes of any other entity by a transformation
of coordinates. Thus by the $c_i$ terms which refer to one of
the systems, not necessarily energy, the state is uniquely defined
in all entities; this statement holds not only for the instant considered
but also for the future, provided that we know the law according to
which the vector changes with time.

22. The Hilbert Space. Concept of Matrix. To carry out the
plan described we begin by setting out the mathematical
means we shall need for our purpose. Our intention is to represent
the given state of the system by a vector in a space with an infinite
number of dimensions and to refer this vector to a coordinate system
$K$. If the entity to which $K$ belongs possesses an infinite set of
discrete values as, for example, the energy of an atom, the axes of the
coordinate system will span a space the infinitely many dimensions
of which are denumerable. At times, however, it will be necessary
to use a concept of space with a continuum of dimensions. For
example, we are faced with this necessity when we consider an entity
the eigenvalues of which form a continuum. For the present, however,
we shall confine our discussion to a discrete spectrum and postpone
the discussion of a $K$ system with a continuous infinity of axes
until Section 30.

In order to arrive at a geometric interpretation of quantum mechanics
we require a geometric scheme different from ordinary Euclidean
geometry in so far as it admits vectors the components $x_0, x_1, \ldots$
of which are complex numbers. We express the magnitude of the
vector $\mathbf{x}$ by the expression

$$|\mathbf{x}|^2 = x_0^* x_0 + x_1^* x_1 + \cdots = \sum_{n=0}^{\infty} x_n^* x_n \tag{112}$$

thus assuring a positive real number. A space in which this equation
holds for the vector $\mathbf{x}$ (provided that it is referred to a suitable
coordinate system $K$) is called a Hilbert space.
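Equation (112) may be looked upon as the scalar product of $\mathbf{x}$
with itself. Accordingly we define the scalar product of two vectors
$\mathbf{x}$ and $\mathbf{y}$ by

$$\mathbf{x} \cdot \mathbf{y} = x_0^* y_0 + x_1^* y_1 + \cdots = \sum_{n=0}^{\infty} x_n^* y_n \tag{113}$$

It follows from this that the scalar multiplication of two Hilbert
vectors is not commutative, for

$$\mathbf{y} \cdot \mathbf{x} = (\mathbf{x} \cdot \mathbf{y})^* \tag{114}$$

However, we see that the distributive law is satisfied:

$$(\mathbf{x} + \mathbf{y}) \cdot \mathbf{z} = \mathbf{x} \cdot \mathbf{z} + \mathbf{y} \cdot \mathbf{z} \tag{115}$$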
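
As a small numerical illustration of (112) to (115), the following Python sketch implements the scalar product with the complex conjugate taken on the first factor. The helper name `dot` and the sample vectors are illustrative assumptions and do not come from the text.

```python
def dot(x, y):
    """Hilbert scalar product x.y = sum_n conj(x_n) * y_n, as in (113)."""
    return sum(xn.conjugate() * yn for xn, yn in zip(x, y))

# Illustrative two-component complex vectors (assumed values).
x = [1 + 2j, 3 - 1j]
y = [2 - 1j, 1j]
z = [0.5 + 0j, -1 + 1j]

xy = dot(x, y)            # x.y
yx = dot(y, x)            # y.x, the complex conjugate of x.y, cf. (114)
norm_sq = dot(x, x)       # |x|^2 from (112), real and non-negative

# Distributive law (115): (x + y).z = x.z + y.z
x_plus_y = [a + b for a, b in zip(x, y)]
lhs = dot(x_plus_y, z)
rhs = dot(x, z) + dot(y, z)
```

Here `xy` and `yx` differ, which is the non-commutativity noted in the text, while `lhs` and `rhs` agree.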


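When $\mathbf{x} \cdot \mathbf{y} = 0$, the vectors $\mathbf{x}$ and $\mathbf{y}$ are called orthogonal. For any
axis of the coordinate system a unit vector $\mathbf{e}_i$ is defined with the
components given for $\mathbf{e}_0$ by

$$x_0 = 1, \qquad x_1 = x_2 = \cdots = 0$$

and for $\mathbf{e}_1$ by

$$x_0 = 0, \qquad x_1 = 1, \qquad x_2 = x_3 = \cdots = 0, \text{ etc.}$$

Consequently any vector can be represented by

$$\mathbf{x} = x_0 \mathbf{e}_0 + x_1 \mathbf{e}_1 + \cdots = \sum_{n=0}^{\infty} x_n \mathbf{e}_n \tag{116}$$

Among the unit vectors the following relations hold:

$$\mathbf{e}_i \cdot \mathbf{e}_i = 1 \qquad \mathbf{e}_i \cdot \mathbf{e}_k = 0 \quad (i \neq k) \tag{117}$$

From the first of the relations above we see that each of the vectors
$\mathbf{e}_i$ has a magnitude of unity. From the second relation we observe
that they are orthogonal. Coordinate systems whose axes are orthogonal
may be called normal.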

Let us now consider two normal coordinate systems $K$ and $K'$.
The components of a vector $\mathbf{x}$ when referred to these are $x_i$ and $x_i'$;
therefore

$$\mathbf{x} = x_0 \mathbf{e}_0 + x_1 \mathbf{e}_1 + \cdots = x_0' \mathbf{e}_0' + x_1' \mathbf{e}_1' + \cdots$$

the primed unit vectors being associated with $K'$. Then, exactly as in
Euclidean geometry, the $x_i$ and $x_i'$ terms are related by linear equations
of the form

$$x_i' = a_{i0} x_0 + a_{i1} x_1 + \cdots = \sum_k a_{ik} x_k \tag{118}$$

Thus the transformation from $K$ to $K'$ is effected by a system of linear
equations with coefficients that can be set in the following arrangement:

$$\begin{pmatrix} a_{00} & a_{01} & a_{02} & \cdots \\ a_{10} & a_{11} & a_{12} & \cdots \\ a_{20} & a_{21} & a_{22} & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}$$

This arrangement is called a matrix; it will be designated in what
follows by $A$ or $\|a_{ik}\|$. If the matrix is required to bring about the
transformation from a normal system $K$ to another normal system
$K'$, the coefficients must satisfy the condition that $\sum_i x_i^* x_i$, by the
transformation, becomes $\sum_i x_i'^* x_i'$, since both sums represent the
square of the magnitude $|\mathbf{x}|^2$ of the vector $\mathbf{x}$. When the $a_{ik}$ terms
satisfy this condition, the matrix is called unitary.

In many applications it is useful to interpret relation (118) not as a
passage from $K$ to $K'$, wherein $x_i$ and $x_i'$ are the components of the
same vector in relation to different coordinate systems, but rather as a
linear correspondence by which a vector $\mathbf{x}$ is transformed into a vector
$\mathbf{x}'$ with components $x_i'$. In this case, $\mathbf{x}$ and $\mathbf{x}'$ are different vectors
referred to the same coordinate system. This interpretation is closely
connected with quantum mechanics, since according to Section 16
the concept of transformation plays an important part in
transformation theory.

23. Addition and Multiplication of Matrices. In order to
establish the rules for the addition and multiplication of matrices we
choose to interpret a matrix as defining a linear transformation. It
may be assumed that a first matrix $A = \|a_{ik}\|$ transforms a vector $\mathbf{x}$
with components $x_i$ into a vector $\mathbf{x}'$ with components $x_i'$, whereas a
second matrix $B = \|b_{ik}\|$ transforms the same vector $\mathbf{x}$ into $\mathbf{x}''$ with
components $x_i''$. Then we define the sum, $S = A + B$, as a matrix
that transforms $\mathbf{x}$ into $\mathbf{x}' + \mathbf{x}''$. The elements $s_{ik}$ of $S$ must therefore
satisfy the equations

$$\sum_k s_{ik} x_k = \sum_k a_{ik} x_k + \sum_k b_{ik} x_k$$

from which it follows that $s_{ik} = a_{ik} + b_{ik}$. Hence the rule for
addition is: two matrices are added by adding their elements.

Now let us consider multiplication. If $k$ is an ordinary number,
we define the product $kA$ as a matrix which transforms $\mathbf{x}$ into $k\mathbf{x}'$ with
components $k x_i'$, that is, $kA = \|k a_{ik}\|$. Hence a matrix is multiplied
by any ordinary number by multiplying every element by that
number.

Let us imagine that two transformations $A$ and $B$ are performed
successively. First $\mathbf{x}$ may be transformed into $\mathbf{x}'$ by $A$, and then $\mathbf{x}'$ into
$\mathbf{x}''$ by $B$. Thus we have

$$x_k' = \sum_l a_{kl} x_l \qquad x_i'' = \sum_k b_{ik} x_k' \tag{119}$$

We wish to know whether $\mathbf{x}$ can transform into $\mathbf{x}''$ by one transformation
$P$, and, if so, how the elements of $P$ are to be evaluated from those
of $A$ and $B$. Considering the two equations (119), we can write

$$x_i'' = \sum_k \sum_l b_{ik} a_{kl} x_l$$

or

$$x_i'' = \sum_l p_{il} x_l \qquad p_{il} = \sum_k b_{ik} a_{kl} \tag{120}$$

Thus the answer to the question is that the successive transformations
$A$ and $B$ on a vector have the same effect as the matrix $P$ with elements
$p_{il} = \sum_k b_{ik} a_{kl}$. $P$ is called the product of $A$ and $B$ and is written
$P = BA$. In this equation care must be taken with respect to the
order of the factors. $BA$ is to be read from the right to the left,
meaning that first $A$ operates on $\mathbf{x}$, and then $B$. The inverse succession,
$AB = P'$ with elements $p_{il}' = \sum_k a_{ik} b_{kl}$, gives a matrix that
generally will be different from $BA$. To obtain the element in the
position $il$ in the case of $BA = P$, we multiply the elements of the $i$th
row of $B$ with those of the $l$th column of $A$, whereas in the second case
the $l$th column of $B$ is combined with the $i$th row of $A$.

The beginner is advised to keep the multiplication rule well in mind.
According to (120) we get the element $p_{il}$ by writing down the letters
$a$ and $b$ in the succession in which $A$ and $B$ occur in the product,
attaching to them the indices $ik\,kl$ and summing up over all $k$.
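
The multiplication rule (120) and the importance of the order of factors can be checked on a small case. The following Python sketch uses assumed 2 by 2 matrices; the helper names `matmul` and `apply` are illustrative and not taken from the text.

```python
def matmul(b, a):
    """Product P = BA with elements p_il = sum_k b_ik * a_kl, as in (120)."""
    return [[sum(b[i][k] * a[k][l] for k in range(len(a)))
             for l in range(len(a[0]))] for i in range(len(b))]

def apply(m, x):
    """Transform the vector x by the matrix m: x'_i = sum_k m_ik * x_k."""
    return [sum(m[i][k] * x[k] for k in range(len(x))) for i in range(len(m))]

# Illustrative matrices and vector (assumed values).
A = [[0, 1], [1, 0]]
B = [[1, 0], [0, -1]]
x = [2, 5]

BA = matmul(B, A)                      # first A, then B
AB = matmul(A, B)                      # first B, then A

successive = apply(B, apply(A, x))     # x -> x' -> x''
single = apply(BA, x)                  # the same result in one step
```

For these matrices `successive` and `single` coincide, whereas `BA` and `AB` are different matrices.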

It is easy to see how three or more matrices are to be multiplied.
$D = CBA$ is the product of $C$ and $BA$; therefore the elements of $D$
are given by

$$d_{ik} = \sum_l c_{il} p_{lk} = \sum_{l,m} c_{il} b_{lm} a_{mk}$$

The same matrix is obtained by multiplying $CB$ by $A$. Hence the
associative law holds:

$$C(BA) = (CB)A$$

The validity of the distributive law is obvious:

$$C(A + B) = CA + CB$$

If in a matrix $A$ only the coefficients $a_{ik}$ with $i = k$ differ from zero,
the matrix has the form

$$A = \begin{pmatrix} a_{00} & 0 & 0 & \cdots \\ 0 & a_{11} & 0 & \cdots \\ 0 & 0 & a_{22} & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}$$

$A$ is called a diagonal matrix. Of special importance is the diagonal
matrix wherein all the diagonal elements equal unity. Then it is
called a unit matrix and is designated by $1$ or $E$. In this case $A = E$,
and in equation (118) $x_i' = x_i$. Thus the $E$ transformation is the
identity (the idemfactor). The multiplication of an arbitrary matrix $A$ by $E$ gives,
by (120),

$$AE = EA = A \tag{121}$$


Let us now determine the matrix which effects a restitution of a
vector $\mathbf{x}$ that is transformed by $A$ into $\mathbf{x}'$. We denote this matrix by
$A^{-1}$ and call it the reciprocal of $A$. Because of its definition $A^{-1}$ must
satisfy the condition

$$A^{-1} A = E \tag{122}$$

This equation is not solved as easily as we might imagine. The
matrices on the right and left must agree in all their elements so that
we obtain the infinite number of equations

$$\sum_l a_{il}' a_{lk} = \delta_{ik} = \begin{cases} 0 & \text{for } i \neq k \\ 1 & \text{for } i = k \end{cases} \tag{123}$$

where the elements of $A^{-1}$ are designated by $a_{ik}'$. These can be solved
only if the determinant of the $a_{ik}$ elements differs from zero.

When we multiply (122) by $A$ to the left we obtain $AA^{-1}A = A$.
By interpreting the left-hand side as $(AA^{-1})A$, we see that $AA^{-1}$ also
equals $E$. The multiplication by the reciprocal is commutative just
as in the case of multiplication by $E$.

Very often it is useful to give a vector the form of a matrix in which
the first column contains the components $x_i$ of a vector $\mathbf{x}$, the other
elements being zero; thus

$$\mathbf{x} = \begin{pmatrix} x_0 & 0 & 0 & \cdots \\ x_1 & 0 & 0 & \cdots \\ x_2 & 0 & 0 & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}$$

Equation (118) then can be written

$$\mathbf{x}' = A\mathbf{x} \tag{124}$$

if $A\mathbf{x}$ is understood to be the product of two matrices $A$ and $\mathbf{x}$. If for
the time being we denote the elements of $\mathbf{x}$ by $x_{ik}$, the $x_{i0}$ alone
differing from zero, the meaning of (124) is

$$x_{i0}' = x_i' = \sum_k a_{ik} x_{k0} = \sum_k a_{ik} x_k$$

which is in agreement with (118).

We make use of this formal possibility of expressing a vector as a
matrix to solve the following problem. A transformation may be
defined, with reference to a given coordinate system $K$, by a matrix
$A = \|a_{ik}\|$. The matrix transforms a vector $\mathbf{x}$ into another vector $\mathbf{y}$
which is related to $\mathbf{x}$ by

$$\mathbf{y} = A\mathbf{x}$$

When we introduce another coordinate system $K'$, the components
$x_i$ and $y_i$ of $\mathbf{x}$ and $\mathbf{y}$ are changed to

$$x_i' = \sum_k s_{ik} x_k \quad \text{and} \quad y_i' = \sum_k s_{ik} y_k$$

which we may write in the form

$$\mathbf{x}' = S\mathbf{x} \quad \text{and} \quad \mathbf{y}' = S\mathbf{y} \tag{125}$$

where $S$ is the matrix formed by the $s_{ik}$ coefficients. The problem is
to find the linear relations between the $x_i'$ and $y_i'$ terms, that is, to
determine the matrix $A'$ which transforms $\mathbf{x}'$ into $\mathbf{y}'$, where $\mathbf{x}'$ and $\mathbf{y}'$
denote the same vectors as $\mathbf{x}$ and $\mathbf{y}$ but referred to $K'$.

To answer the question we first solve equations (125) with respect
to $\mathbf{x}$ and $\mathbf{y}$. Multiplying each by $S^{-1}$, keeping in mind that $S^{-1}S = E$,
we obtain

$$\mathbf{x} = S^{-1}\mathbf{x}' \quad \text{and} \quad \mathbf{y} = S^{-1}\mathbf{y}'$$

When we introduce the expressions into $\mathbf{y} = A\mathbf{x}$, we obtain

$$S^{-1}\mathbf{y}' = AS^{-1}\mathbf{x}'$$

or

$$\mathbf{y}' = SAS^{-1}\mathbf{x}'$$

from which we can infer that $A' = SAS^{-1}$. Thus we can state the
theorem: when the coordinate system is changed from $K$ to $K'$, the
relation between these being given by $\mathbf{x}' = S\mathbf{x}$, a matrix $A$ is changed
into $A'$ when

$$A' = SAS^{-1} \tag{126}$$

$A$ and $A'$ represent the same transformation but are referred to
different coordinate systems.
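
Theorem (126) can be verified on a small example. The Python sketch below uses assumed integer matrices so that every step can be followed by hand; the names `matmul`, `apply`, and the particular choice of $S$ are illustrative. Note that this $S$ is not unitary: the derivation of (126) requires only that $S$ possess a reciprocal.

```python
def matmul(b, a):
    """Matrix product with elements sum_k b[i][k] * a[k][l]."""
    return [[sum(b[i][k] * a[k][l] for k in range(len(a)))
             for l in range(len(a[0]))] for i in range(len(b))]

def apply(m, x):
    """Transform the vector x by the matrix m."""
    return [sum(m[i][k] * x[k] for k in range(len(x))) for i in range(len(m))]

# Illustrative matrices (assumed values); S_inv is chosen so that S_inv S = E.
A = [[2, 1], [1, 2]]
S = [[1, 1], [0, 1]]
S_inv = [[1, -1], [0, 1]]

A_prime = matmul(matmul(S, A), S_inv)       # A' = S A S^{-1}, as in (126)

x = [1, 2]
y = apply(A, x)                             # y = A x, referred to K
x_prime = apply(S, x)                       # x' = S x, as in (125)
y_prime = apply(S, y)                       # y' = S y, as in (125)
y_prime_via_A_prime = apply(A_prime, x_prime)
```

The vector `y_prime_via_A_prime` agrees with `y_prime`, so `A_prime` describes in $K'$ the same transformation that `A` describes in $K$.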

24. Dual, Unitary, Hermitean Matrices. When in a matrix $A$
the rows and columns are interchanged, that is, when element $a_{ik}$ of
the $i$th row and $k$th column is placed where the $k$th row intersects
the $i$th column, the matrix $A$ is changed into another matrix $\tilde{A}$ which
is called the dual of $A$. Thus the definition of $\tilde{A}$ is $\tilde{a}_{ik} = a_{ki}$. When
the product $P = BA$ is subject to this operation, we get $\tilde{P} = \widetilde{BA}$, with
the elements

$$\tilde{p}_{ik} = p_{ki} = \sum_l b_{kl} a_{li} = \sum_l \tilde{a}_{il} \tilde{b}_{lk}$$

Therefore

$$\widetilde{BA} = \tilde{A}\tilde{B} \tag{127}$$


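The concept of the dual matrix makes it possible to represent the
scalar product of two vectors $\mathbf{x}$ and $\mathbf{y}$ in the form of a matrix product.
If we understand $\mathbf{x}^*$ to be the matrix

$$\mathbf{x}^* = \begin{pmatrix} x_0^* & 0 & 0 & \cdots \\ x_1^* & 0 & 0 & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}$$

and hence

$$\tilde{\mathbf{x}}^* = \begin{pmatrix} x_0^* & x_1^* & \cdots \\ 0 & 0 & \cdots \\ \vdots & \vdots & \ddots \end{pmatrix}$$

then the product $\tilde{\mathbf{x}}^* \mathbf{y}$ is a matrix in the upper left-hand corner of
which is the element $\sum_i x_i^* y_i = \mathbf{x} \cdot \mathbf{y}$. All the other terms are zero.
Thus there is a unique correspondence between the scalar product
$\mathbf{x} \cdot \mathbf{y}$ and the matrix $\tilde{\mathbf{x}}^* \mathbf{y}$. This affords a very simple way to derive
the condition that must be fulfilled by a matrix $S$ which is required to
effect the transition from a normal coordinate system $K$ to another
normal system $K'$. We know that this transition leaves the magnitude
of the vector $\mathbf{x}$ invariant so that $\sum_i x_i^* x_i = \sum_i x_i'^* x_i'$. This condition
can be expressed by

$$\tilde{\mathbf{x}}^* \mathbf{x} = \tilde{\mathbf{x}}'^* \mathbf{x}' \tag{128}$$

Now $\mathbf{x}' = S\mathbf{x}$, and thus $\mathbf{x}'^* = S^* \mathbf{x}^*$ if $S^*$ represents the matrix whose
elements are the complex conjugates of those of $S$. According to
(127) the dual matrix of $\mathbf{x}'^* = S^* \mathbf{x}^*$ is $\tilde{\mathbf{x}}^* \tilde{S}^*$. Hence (128) can be
written

$$\tilde{\mathbf{x}}^* \mathbf{x} = \tilde{\mathbf{x}}^* \tilde{S}^* S \mathbf{x}$$

from which we see that a unitary transformation $S$ must satisfy the
condition

$$\tilde{S}^* S = E \tag{129}$$

When we multiply the last equation on the left by $S$, we get $S\tilde{S}^* S = S$
and therefore $S\tilde{S}^* = E$. If we multiply (129) on the right by $S^{-1}$ we find
that $\tilde{S}^* = S^{-1}$.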
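
Condition (129) and the invariance of the magnitude can be checked numerically. In the Python sketch below the 2 by 2 matrix `S` and the vector `x` are assumed illustrative values, and the helper names (`dagger` for the dual matrix with conjugated elements, `matmul`, `apply`, `norm_sq`) are not taken from the text.

```python
import math

def matmul(b, a):
    """Matrix product with elements sum_k b[i][k] * a[k][l]."""
    return [[sum(b[i][k] * a[k][l] for k in range(len(a)))
             for l in range(len(a[0]))] for i in range(len(b))]

def dagger(m):
    """Dual (rows and columns interchanged) with complex-conjugated elements."""
    return [[m[k][i].conjugate() for k in range(len(m))]
            for i in range(len(m[0]))]

def apply(m, x):
    """Transform the vector x by the matrix m."""
    return [sum(m[i][k] * x[k] for k in range(len(x))) for i in range(len(m))]

def norm_sq(x):
    """Squared magnitude |x|^2 = sum_n |x_n|^2, as in (112)."""
    return sum(abs(c) ** 2 for c in x)

r = 1 / math.sqrt(2)
S = [[r, 1j * r], [1j * r, r]]       # assumed example of a unitary matrix

left = matmul(dagger(S), S)          # should be close to the unit matrix E, cf. (129)
right = matmul(S, dagger(S))         # likewise close to E

x = [1 + 2j, 3 - 1j]
x_prime = apply(S, x)                # x' = S x; the magnitude is unchanged, cf. (128)
```

Up to rounding error both products equal the unit matrix, and `norm_sq(x_prime)` equals `norm_sq(x)`.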

Of special importance in quantum mechanics are the so-called
Hermitean matrices which are defined by $A = \tilde{A}^*$ or $a_{ik} = a_{ki}^*$.
Throughout this text we shall use the notation $A^\dagger$ for $\tilde{A}^*$; and thus a
Hermitean matrix will be given by $A = A^\dagger$. If $A$ is Hermitean, the
scalar product of $\mathbf{x}$ and $\mathbf{y} = A\mathbf{x}$ is

$$\mathbf{x} \cdot \mathbf{y} = \sum_n x_n^* y_n = \sum_{n,k} x_n^* a_{nk} x_k$$

This expression is called the Hermitean form of $\mathbf{x}$ and assumes only
real values, for, since $a_{ik} = a_{ki}^*$, two terms $a_{nk} x_n^* x_k$ and $a_{kn} x_k^* x_n$
are conjugate complex and hence their sum is real.
It is readily seen that $\mathbf{x} \cdot A\mathbf{x} = A\mathbf{x} \cdot \mathbf{x}$, since (113) gives for the
latter product

$$\sum_n y_n^* x_n = \sum_{n,k} a_{nk}^* x_k^* x_n = \sum_{n,k} x_k^* a_{kn} x_n$$

25. Transformation to Principal Axes. In order to get some
idea about how to continue the mathematical procedure, let us recall
the quantum-mechanical principle of correlating a transformation to
any physical entity or, as we shall write from here on, to any
observable. When the transformation $A$ is applied to the vector $\mathbf{x}$, we obtain
another vector $A\mathbf{x}$ which in general differs from $\mathbf{x}$ in both magnitude
and direction. However, certain vectors do exist, called the
eigenvectors of the transformation, which are changed by the
transformation into a multiple $\alpha$ of themselves, so that $\mathbf{x}$ and $A\mathbf{x}$ differ only in
magnitude. Hence to each of the vectors we can assign a certain
eigenvalue and a certain direction called a principal axis of the
transformation. The fundamental idea of quantum mechanics then is
this: when on a given system we carry out a measurement of an
observable belonging to the operator $A$, the result must always be an
eigenvalue $\alpha$ of $A$. Just which particular $\alpha$ will be observed depends on the
direction of the vector representing the state. When $\mathbf{x}$ happens to
coincide in direction with a principal axis, that $\alpha$ which is the
eigenvalue for that direction will be the certain result of the measurement.
For all other cases there is for any $\alpha$ only a certain probability which
can be calculated from the direction of the representing vector. Thus
quantum mechanics is interested primarily in the eigenvalues of the
transformation, and as a consequence our most important task will be
to work out a method which enables us to determine the eigenvalues
and the directions of the principal axes for a given transformation $A$.

This problem is solved on the supposition that the matrix A is 
Hermitean, this type being of special interest in quantum mechanics, 
and also on the supposition that the vector space has a finite number of 
dimensions. Although it is true that in reality we are concerned with 
spaces having an infinite number of dimensions, it seems probable that 
the method to be developed will hold for that case as well. 

In $(n + 1)$-dimensional space we lay out a normal coordinate system
$K$. Relative to $K$, the components of $\mathbf{x}$ may be given by $x_0, x_1,
x_2, \ldots, x_n$, and the Hermitean $A$ may be defined by

$$A = \begin{pmatrix} a_{00} & a_{01} & a_{02} & \cdots & a_{0n} \\ a_{10} & a_{11} & a_{12} & \cdots & a_{1n} \\ \vdots & \vdots & \vdots & & \vdots \\ a_{n0} & a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix}$$

The problem is to find another coordinate system $K'$ such that in the
transition $K \to K'$ the matrix $A$ is converted to the diagonal matrix

$$A' = \begin{pmatrix} \alpha_0 & 0 & 0 & \cdots & 0 \\ 0 & \alpha_1 & 0 & \cdots & 0 \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & 0 & 0 & \cdots & \alpha_n \end{pmatrix} \tag{130}$$

If we succeed in finding such a system we have solved the problem and
can consider the $\alpha$ terms the eigenvalues, and the axes of $K'$ the principal
axes of $A$, for the equation $\mathbf{y} = A\mathbf{x}$ when referred to $K'$ changes
into $\mathbf{y}' = A'\mathbf{x}'$; that is,

$$y_0' = \alpha_0 x_0' \qquad y_1' = \alpha_1 x_1' \qquad \cdots \qquad y_n' = \alpha_n x_n'$$

Thus a vector $\mathbf{x}'$ with the components $x_0' = a$, $x_1' = x_2' = \cdots
= x_n' = 0$ is transformed into $\mathbf{y}'$ with components $y_0' = \alpha_0 a$, $y_1' =
y_2' = \cdots = y_n' = 0$, a vector having the direction of the first axis
of $K'$, and changed only in magnitude. Hence $\mathbf{x}'$ is an eigenvector
of $A$ belonging to the eigenvalue $\alpha_0$, and the same is true for every
vector having the direction of any axis of $K'$.

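If eigenvalues exist for $A$ at all, there must exist also a system $K'$,
but of course there is no a priori certainty that $K'$ is normal, so that,
in the transition $K \to K'$, $|\mathbf{x}|^2$ is transformed into $|\mathbf{x}'|^2$. However, it
can be shown that this is the case when $A$ is Hermitean, so that
$A = A^\dagger$ (the dual matrix with conjugated elements). Here we can achieve the diagonal form (130) by performing
a unitary transformation. To prove this first let us consider the normal
coordinate system $K$ and determine a vector $\mathbf{x}(x_0, x_1, \ldots, x_n)$
which differs from the vector $\mathbf{y} = A\mathbf{x}$ in magnitude only, that is, $A\mathbf{x} =
\alpha\mathbf{x}$, or, since the components of $A\mathbf{x}$ are given by $y_i = \sum_k a_{ik} x_k$,

$$y_i = \sum_k a_{ik} x_k = \alpha x_i \qquad (i = 0, 1, 2, \ldots, n)$$

This means that we must solve the homogeneous equations

$$\begin{aligned} (a_{00} - \alpha) x_0 + a_{01} x_1 + \cdots + a_{0n} x_n &= 0 \\ a_{10} x_0 + (a_{11} - \alpha) x_1 + \cdots + a_{1n} x_n &= 0 \\ \cdots\cdots\cdots\cdots\cdots\cdots\cdots\cdots\cdots & \\ a_{n0} x_0 + a_{n1} x_1 + \cdots + (a_{nn} - \alpha) x_n &= 0 \end{aligned} \tag{131}$$

in which $\alpha, x_0, x_1, \ldots, x_n$ are considered to be unknown but the
$a_{ik}$ terms are given. These equations can be solved for a vector other
than zero only if the determinant equation below is satisfied.

$$\begin{vmatrix} a_{00} - \alpha & a_{01} & \cdots & a_{0n} \\ a_{10} & a_{11} - \alpha & \cdots & a_{1n} \\ \vdots & \vdots & & \vdots \\ a_{n0} & a_{n1} & \cdots & a_{nn} - \alpha \end{vmatrix} = 0$$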

This is an equation of the $(n + 1)$th degree in $\alpha$, and accordingly there
are $n + 1$ solutions, not necessarily all different. Let one of these be $\alpha_0$. Introducing this value in (131),
we can solve these equations for a vector $\mathbf{x}_0$ with components $x_0,
x_1, \ldots, x_n$ and we can construct a coordinate system $K_0$ so that
one of the axes has the direction of $\mathbf{x}_0$. When the transformation $A$
is referred to this system, $A$ is changed into the matrix $A' = \|a_{ik}'\|$,
which transforms a vector $x_0' = a$, $x_1' = x_2' = \cdots = x_n' = 0$ into
$y_0' = \alpha_0 a$, $y_1' = y_2' = \cdots = y_n' = 0$. Thus the following relations
must be satisfied.

$$y_0' = \sum_k a_{0k}' x_k' = a_{00}' a = \alpha_0 a \qquad y_1' = a_{10}' a = 0 \qquad \cdots \qquad y_n' = a_{n0}' a = 0$$

From this it follows that $a_{00}' = \alpha_0$, $a_{10}' = \cdots = a_{n0}' = 0$. It is
easy to prove that the $a_{0i}'$ terms must vanish with the $a_{i0}'$ ones, for
if $S$ is the unitary matrix which effects the transition $K \to K_0$ by
$\mathbf{x}' = S\mathbf{x}$, then, according to (126) and (127), $A' = SAS^{-1}$ and $A'^\dagger = (S^{-1})^\dagger A^\dagger S^\dagger$,
where $\dagger$ denotes the dual matrix with conjugated elements.
Now $A$ is Hermitean and $S$ is unitary, so that $A^\dagger = A$, $S^\dagger = S^{-1}$,
and $(S^{-1})^\dagger = S$. Consequently we obtain $A'^\dagger = SAS^{-1} = A'$, and thus
$A'$ is also Hermitean. This means that $a_{0i}' = a_{i0}'^* = 0$ for $i \neq 0$.
Thus the transformation $A$ when associated with the system $K_0$ is of
the form

$$A' = \begin{pmatrix} \alpha_0 & 0 & 0 & \cdots & 0 \\ 0 & a_{11}' & a_{12}' & \cdots & a_{1n}' \\ 0 & a_{21}' & a_{22}' & \cdots & a_{2n}' \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & a_{n1}' & a_{n2}' & \cdots & a_{nn}' \end{pmatrix}$$
We refer now to the system $K_0$ and seek a vector $\mathbf{x}_1(x_0', x_1', \ldots, x_n')$
which must satisfy two conditions:

(i) It must be perpendicular to $\mathbf{x}_0$.
(ii) It must represent an eigenvector of $A$.

The first condition requires that $x_0' = 0$, for the scalar product $(\mathbf{x}_0 \mathbf{x}_1)$
equals zero only if this is so. The second condition gives us the $n$
equations:

$$\begin{aligned} (a_{11}' - \alpha) x_1' + a_{12}' x_2' + \cdots + a_{1n}' x_n' &= 0 \\ \cdots\cdots\cdots\cdots\cdots\cdots\cdots\cdots\cdots & \\ a_{n1}' x_1' + a_{n2}' x_2' + \cdots + (a_{nn}' - \alpha) x_n' &= 0 \end{aligned} \tag{132}$$

Using these equations, we repeat the procedure to determine $\alpha$, for
which the determinant equation is

$$\begin{vmatrix} a_{11}' - \alpha & a_{12}' & \cdots & a_{1n}' \\ a_{21}' & a_{22}' - \alpha & \cdots & a_{2n}' \\ \vdots & \vdots & & \vdots \\ a_{n1}' & a_{n2}' & \cdots & a_{nn}' - \alpha \end{vmatrix} = 0$$

and with $\alpha = \alpha_1$ we obtain a solution $\mathbf{x}_1$ from (132). We then rotate
the $K_0$ system about the fixed $\mathbf{x}_0$ axis until a second axis coincides
with the direction of $\mathbf{x}_1$, this being possible since $\mathbf{x}_0$ and $\mathbf{x}_1$ are
perpendicular. In this way a coordinate system $K_1$ is established relative
to which the transformation $A$ is represented by a matrix:

$$A'' = \begin{pmatrix} \alpha_0 & 0 & 0 & \cdots & 0 \\ 0 & \alpha_1 & 0 & \cdots & 0 \\ 0 & 0 & a_{22}'' & \cdots & a_{2n}'' \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & 0 & a_{n2}'' & \cdots & a_{nn}'' \end{pmatrix}$$

It is obvious that a continuation of this procedure ultimately will
furnish a normal coordinate system for which $A$ has the form of a
diagonal matrix. Thus it can be asserted that in the case of a Hermitean
transformation $A$ there exists always a coordinate system the
axes of which coincide with the principal axes of $A$.
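
For a space of only two dimensions the whole procedure can be carried out in closed form. The following Python sketch is a minimal illustration under stated assumptions: the matrix `A` is an assumed Hermitean example with a non-vanishing off-diagonal element, and the function `principal_axes_2x2` and the other helper names are hypothetical, not taken from the text. It solves the determinant equation, builds the unit eigenvectors, and checks that $SAS^{-1}$ is diagonal when the rows of $S$ are the conjugated eigenvectors.

```python
import math

def matmul(b, a):
    """Matrix product with elements sum_k b[i][k] * a[k][l]."""
    return [[sum(b[i][k] * a[k][l] for k in range(len(a)))
             for l in range(len(a[0]))] for i in range(len(b))]

def dagger(m):
    """Dual matrix with complex-conjugated elements."""
    return [[m[k][i].conjugate() for k in range(len(m))]
            for i in range(len(m[0]))]

def principal_axes_2x2(a):
    """Eigenvalues and unit eigenvectors of a 2x2 Hermitean matrix, a[0][1] != 0."""
    a00, a11, a01 = a[0][0].real, a[1][1].real, a[0][1]
    mean = (a00 + a11) / 2
    radius = math.sqrt(((a00 - a11) / 2) ** 2 + abs(a01) ** 2)
    alphas = [mean - radius, mean + radius]    # roots of the determinant equation
    vectors = []
    for alpha in alphas:
        v = [a01, alpha - a00]                 # solves (a00 - alpha) v0 + a01 v1 = 0
        length = math.sqrt(abs(v[0]) ** 2 + abs(v[1]) ** 2)
        vectors.append([c / length for c in v])
    return alphas, vectors

A = [[2, 1j], [-1j, 2]]                        # assumed Hermitean example
alphas, vectors = principal_axes_2x2(A)

# Rows of S are the conjugated eigenvectors, so x' = S x gives the components
# of x along the principal axes; S is unitary, hence S^{-1} = dagger(S).
S = [[c.conjugate() for c in v] for v in vectors]
A_prime = matmul(matmul(S, A), dagger(S))      # A' = S A S^{-1}, cf. (126) and (130)
```

Up to rounding error `A_prime` is diagonal with the eigenvalues `alphas` on the diagonal. The closed form is a shortcut available only for two dimensions; the step-by-step construction in the text applies to any finite number.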

If the eigenvalues all differ from each other, no other system besides
$K$ exists relative to which the transformation can be effected by a
diagonal matrix.† If, however, two or more of the $\alpha$ terms agree, the
$K$ system is not uniquely determined because that part of the system
which has the same value for the $\alpha$ terms is capable of a rotation within
the space associated with a common $\alpha$ without $A$ losing its diagonal
form. For example, let us assume $\alpha_0 = \alpha_1 = \alpha_2 = \alpha$. Then the
transformation $\mathbf{y} = A\mathbf{x}$ in the principal system $K$ is described by the
equations $y_0 = \alpha x_0$, $y_1 = \alpha x_1$, $y_2 = \alpha x_2, \ldots$. Now we replace $K$
by another system $K'$ by exchanging the first three axes of $K$ for
three others, keeping the remainder, the condition being fulfilled that
the three new axes are not only perpendicular to one another but to
all the rest as well. The transition $K \to K'$ is described by the
equations

$$\begin{aligned} x_0' &= s_{00} x_0 + s_{01} x_1 + s_{02} x_2 \\ x_1' &= s_{10} x_0 + s_{11} x_1 + s_{12} x_2 \\ x_2' &= s_{20} x_0 + s_{21} x_1 + s_{22} x_2 \end{aligned} \tag{133}$$

with the components from $x_3$ upward being unchanged. The components
$x_0$, $x_1$, and $x_2$ are changed by the operator $A$ in the ratio $\alpha$.
Because of (133), this means that $x_0'$, $x_1'$, $x_2'$ are changed in the same
ratio, so that the matrix $A$ is diagonal whether referred to $K$ or $K'$.

If the $\alpha$ eigenvalues are not all different one from the other, the
system is degenerate. Hence a degenerate $A$ does not define the
principal coordinate system uniquely but permits a unitary rotation of
the system within the space of degeneracy.‡

† Speaking more precisely, there is no system that differs from $K$ in the directions
of the axes. A certain arbitrariness, however, is left in the choice of the unit
vectors $\mathbf{e}_0, \mathbf{e}_1, \ldots, \mathbf{e}_n$ which are required for the complete specification of the
coordinate system. A given vector $\mathbf{x}$ is related to the $\mathbf{e}_i$ vectors by $\mathbf{x} = x_0 \mathbf{e}_0 +
\cdots + x_n \mathbf{e}_n$, from which we see that the components $x_i$ of $\mathbf{x}$ are determined by
the $\mathbf{e}_i$ vectors. The unit vectors $\mathbf{e}_i$ must satisfy the condition $|\mathbf{e}_i|^2 = 1$; but this
condition remains fulfilled when we multiply $\mathbf{e}_i$ by a factor of the form $e^{i\varphi_i}$. From
$K$ we then get another system $K'$ relative to which the vector has other components
$x_i'$, which are related to the $x_i$ components by the equation

$$\mathbf{x} = x_0 \mathbf{e}_0 + \cdots + x_n \mathbf{e}_n = x_0' \mathbf{e}_0' + \cdots + x_n' \mathbf{e}_n'$$

With $\mathbf{e}_i' = \mathbf{e}_i e^{i\varphi_i}$, we get from this equation $x_i' = x_i e^{-i\varphi_i}$. This corresponds to a
transformation of the form

$$S = \begin{pmatrix} e^{-i\varphi_0} & 0 & \cdots & 0 \\ 0 & e^{-i\varphi_1} & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & e^{-i\varphi_n} \end{pmatrix}$$

Thus, in the case of different $\alpha$, all coordinate systems relative to which $A$ appears
as a diagonal matrix coincide in the directions of the axes but differ by a unitary
matrix of the kind represented above.

‡ In order to determine the matrix $S$ corresponding to such a rotation we assume
that the eigenvalues $\alpha_0, \alpha_1, \ldots$ occur $r_0, r_1, r_2, \ldots$ times respectively. By
$S^{r_0}, S^{r_1}, S^{r_2}, \ldots$ we denote unitary matrices consisting of $r_0, r_1, r_2, \ldots$ rows
and as many columns. Then we may maintain: If all these matrices are amalgamated
into one matrix in such a way that $S^{r_0}, S^{r_1}, S^{r_2}, \ldots$ are arranged along the
diagonal, as

$$S = \begin{pmatrix} S^{r_0} & 0 & 0 & \cdots \\ 0 & S^{r_1} & 0 & \cdots \\ 0 & 0 & S^{r_2} & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}$$

the matrix $A$ retains its diagonal form when the principal coordinate system $K$ is
changed, with the help of the above, into another system $K'$. The reader will
have no difficulty in verifying this statement by applying (126).


The theorem concerning the transformation to principal axes can
be formulated in yet another way which brings in the Hermitean form,
$A\mathbf{x} \cdot \mathbf{x}$, defined in Section 24. As we know, a unitary operator which
changes $\mathbf{x}$ and $\mathbf{y}$ into $\mathbf{x}'$ and $\mathbf{y}'$ respectively leaves the scalar
product $\mathbf{x} \cdot \mathbf{y} = \sum_n x_n^* y_n$ invariant, that is, $\mathbf{x} \cdot \mathbf{y} = \sum_n x_n^* y_n = \mathbf{x}' \cdot \mathbf{y}' =
\sum_n x_n'^* y_n'$. Therefore the transition to the principal system will change
$A\mathbf{x} \cdot \mathbf{x}$ into $A'\mathbf{x}' \cdot \mathbf{x}'$ or $\sum_{i,k} a_{ik} x_i^* x_k$ into $\alpha_0 x_0'^* x_0' + \alpha_1 x_1'^* x_1' + \cdots$,
giving us the theorem: Any Hermitean form $\sum_{i,k} a_{ik} x_i^* x_k$ can be changed into
$\alpha_0 x_0'^* x_0' + \alpha_1 x_1'^* x_1' + \cdots$ by a transformation to the principal axes
of $A$.

A question of importance in quantum mechanics is that which seeks
to learn under what conditions two Hermitean operators, $A$ and $B$,
can be made diagonal simultaneously, that is, by the same
transformation. It is easy to show that this is possible only for two matrices
which are commutative, so that $AB = BA$. Let us assume that $K'$
is a coordinate system in which both operators are of the required
form.

$$A' = \begin{pmatrix} \alpha_0 & 0 & \cdots \\ 0 & \alpha_1 & \cdots \\ \vdots & \vdots & \ddots \end{pmatrix} \qquad B' = \begin{pmatrix} \beta_0 & 0 & \cdots \\ 0 & \beta_1 & \cdots \\ \vdots & \vdots & \ddots \end{pmatrix}$$

Then $A'B'$ is a matrix with the elements

$$p_{ik} = \sum_l a_{il}' b_{lk}' = \begin{cases} \alpha_i \beta_i & \text{for } i = k \\ 0 & \text{for } i \neq k \end{cases}$$

Thus $A'B'$ is diagonal with $p_{ii} = \alpha_i \beta_i$, and exactly the same holds for
$B'A'$ since in that case we simply interchange $\alpha_i$ and $\beta_i$. Therefore
$A'B' = B'A'$. If, now, $S$ is a matrix corresponding to the transition
$K \to K'$, then, according to (126), $A' = SAS^{-1}$ and $B' = SBS^{-1}$, so
that the equality $A'B' = B'A'$ can be written

$$SAS^{-1}SBS^{-1} = SBS^{-1}SAS^{-1}$$

and, since $S^{-1}S = E$ and $AEB = AB$, this reads $SABS^{-1} = SBAS^{-1}$;
multiplying by $S^{-1}$ on the left and by $S$ on the right, we have

$$AB = BA$$

And so, to have a simultaneous transformation to principal axes, $A$
and $B$ must be commutative. Although it is not difficult to prove
that this condition is sufficient as well, we shall accept it without proof.
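
The necessity of the condition $AB = BA$ can be illustrated numerically. In the Python sketch below, two assumed diagonal matrices are referred back to a system $K$ by a real unitary matrix `S` (all values and helper names are illustrative assumptions); the resulting operators commute. A second pair of Hermitean matrices, `X` and `Z`, does not commute and therefore cannot be made diagonal by one and the same transformation.

```python
def matmul(b, a):
    """Matrix product with elements sum_k b[i][k] * a[k][l]."""
    return [[sum(b[i][k] * a[k][l] for k in range(len(a)))
             for l in range(len(a[0]))] for i in range(len(b))]

def transpose(m):
    """Dual matrix (rows and columns interchanged)."""
    return [list(row) for row in zip(*m)]

S = [[0.6, 0.8], [-0.8, 0.6]]     # assumed real unitary matrix, so S^{-1} is its dual
S_inv = transpose(S)

A_diag = [[1, 0], [0, 2]]         # A' with alpha_0 = 1, alpha_1 = 2
B_diag = [[5, 0], [0, 7]]         # B' with beta_0 = 5, beta_1 = 7

# Refer both operators back to K: A = S^{-1} A' S and B = S^{-1} B' S.
A = matmul(matmul(S_inv, A_diag), S)
B = matmul(matmul(S_inv, B_diag), S)
AB, BA = matmul(A, B), matmul(B, A)

# A pair of Hermitean matrices that do not commute.
X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]
XZ, ZX = matmul(X, Z), matmul(Z, X)
```

Up to rounding error `AB` and `BA` agree, whereas `XZ` and `ZX` differ in sign.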

26. Functions of Matrices. When a group of matrices, $A_1, A_2,
\ldots, A_n$, are given, new matrices can be formed from these by addition
and multiplication. By so doing, we get functions of the $A_i$ matrices.
The simplest examples are the sum, $S = A_i + A_k$, and the product,
$P = A_i A_k$. Functions of a general type are obtained when the $A_i$
matrices are multiplied with one another in an arbitrary succession,
and these products, when provided with coefficients, are united as
polynomials. In such a manner we obtain the function

$$F(A_1, A_2, \ldots, A_n) = \sum c_{i_1 i_2 \cdots i_m} A_{i_1} A_{i_2} \cdots A_{i_m}$$

where $i_1, i_2, \ldots, i_m$ signify equal or different numbers of the
sequence $1, 2, 3, \ldots, n$. When in all terms of $F$ the succession of
factors is inverted and the conjugate complex values $c^*$ are substituted
for the coefficient $c$, we obtain

$$G(A_1, \ldots, A_n) = \sum c_{i_1 i_2 \cdots i_m}^* A_{i_m} \cdots A_{i_2} A_{i_1}$$

which is called the adjunct function of $F$. If $G$ and $F$ are identical, $F$
is said to be adjunct to itself. For example, $F = A_1^2 + A_2^2 + \cdots +
A_n^2$ is such. We shall note here a theorem which will be used later:
if $A_1, A_2, \ldots, A_n$ are Hermitean and $F$ is self-adjunct, $F$ is Hermitean
also. Indeed, writing $\dagger$ for the dual matrix with conjugated elements, we can write

$$F^\dagger = \sum c_{i_1 i_2 \cdots i_m}^* A_{i_m}^\dagger \cdots A_{i_2}^\dagger A_{i_1}^\dagger = \sum c_{i_1 i_2 \cdots i_m}^* A_{i_m} \cdots A_{i_2} A_{i_1} = G = F$$


When we transform from a system $K$ to which the matrices are
referred to another system $K'$, carrying out a unitary transformation,
the matrices are changed into

$$A_1' = SA_1S^{-1} \qquad A_2' = SA_2S^{-1}$$

and so on. From this it follows that the product $A_{i_1} A_{i_2} \cdots A_{i_m}$
becomes

$$A_{i_1}' A_{i_2}' \cdots A_{i_m}' = SA_{i_1}S^{-1}SA_{i_2}S^{-1} \cdots = SA_{i_1}A_{i_2} \cdots A_{i_m}S^{-1}$$

Thus an arbitrary function of matrices $F$ is transformed by $S$ just as a
single matrix into

$$F' = SFS^{-1} \tag{134}$$


Operations having the nature of differentiation can be performed on
functions of matrices, and these processes can be defined in different
ways. For the purposes of quantum mechanics it is convenient to
define the derivative of a function as follows: If $F$ is a function of the
matrices $X, Y, Z, \ldots$, the derivative of $F$ with respect to $X$ is

$$\frac{\partial F}{\partial X} = \lim_{a \to 0} \frac{F(X + aE, Y, Z, \ldots) - F(X, Y, Z, \ldots)}{a}$$

where $aE$ is the unit matrix multiplied by $a$. The ordinary derivative
is given by

$$\frac{\partial f}{\partial x} = \lim_{a \to 0} \frac{f(x + a, y, z) - f(x, y, z)}{a}$$

from which we see that the only difference between the derivative of
$F(X, Y, Z)$ and the ordinary one is that $X$ is increased by $aE$ and not
simply by $a$, which would be meaningless since a matrix cannot be
added to an ordinary number. Accordingly, if $F = X$,

$$\frac{\partial F}{\partial X} = \lim_{a \to 0} \frac{X + aE - X}{a} = E$$

If $F = XX = X^2$,

$$\frac{\partial F}{\partial X} = \lim_{a \to 0} \frac{(X + aE)^2 - X^2}{a} = 2X$$

If $F = XYZ$,

$$\frac{\partial F}{\partial X} = \lim_{a \to 0} \frac{(X + aE)YZ - XYZ}{a} = YZ$$

If $F_1$ and $F_2$ are two different functions of $X, Y, Z$, it is clear that

$$\frac{\partial}{\partial X}(F_1 + F_2) = \frac{\partial F_1}{\partial X} + \frac{\partial F_2}{\partial X}$$

Furthermore

$$\frac{\partial}{\partial X}(F_1 F_2) = \lim_{a \to 0} \frac{F_1(X + aE)F_2(X + aE) - F_1(X)F_2(X)}{a}$$

The numerator can be put into the form

$$F_1(X + aE)[F_2(X + aE) - F_2(X)] + [F_1(X + aE) - F_1(X)]F_2(X)$$

Therefore

$$\frac{\partial (F_1 F_2)}{\partial X} = \frac{\partial F_1}{\partial X} F_2 + F_1 \frac{\partial F_2}{\partial X} \tag{135}$$
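
The definition of the derivative can be tried out numerically by replacing the limit with a small finite value of $a$. The Python sketch below is only an approximation under that assumption; the function name `derivative`, the step `a = 1e-6`, and the sample matrices are illustrative choices, not part of the text.

```python
def matmul(b, a):
    """Matrix product with elements sum_k b[i][k] * a[k][l]."""
    return [[sum(b[i][k] * a[k][l] for k in range(len(a)))
             for l in range(len(a[0]))] for i in range(len(b))]

def derivative(f, x, a=1e-6):
    """Approximate dF/dX by [F(X + aE) - F(X)] / a with a small finite a."""
    n = len(x)
    shifted = [[x[i][k] + (a if i == k else 0.0) for k in range(n)]
               for i in range(n)]
    f_x, f_shifted = f(x), f(shifted)
    return [[(f_shifted[i][k] - f_x[i][k]) / a for k in range(n)]
            for i in range(n)]

# Illustrative matrices (assumed values).
X = [[1.0, 2.0], [3.0, 4.0]]
Y = [[0.0, 1.0], [1.0, 0.0]]
Z = [[2.0, 0.0], [0.0, 3.0]]

d_square = derivative(lambda m: matmul(m, m), X)              # close to 2X
d_triple = derivative(lambda m: matmul(matmul(m, Y), Z), X)   # close to YZ
YZ = matmul(Y, Z)
```

With a finite step the quotient for $X^2$ equals $2X + aE$ exactly, so `d_square` differs from $2X$ only by terms of the order of `a`.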


A matrix, then, can be a function of other matrices. It can also
just as well be a function of an ordinary variable such as time. This
case arises if the elements of $A$ are functions of time; $a_{ik} = a_{ik}(t)$.
The derivative $dA/dt$ is defined by the matrix $\|da_{ik}/dt\|$. For example,
if we consider the vector $\mathbf{x}$ to be a matrix which contains the
elements of $\mathbf{x}$ in the first column, all the other elements being zero, then
$d\mathbf{x}/dt$ is the matrix

$$\frac{d\mathbf{x}}{dt} = \begin{pmatrix} \dfrac{dx_0}{dt} & 0 & 0 & \cdots \\ \dfrac{dx_1}{dt} & 0 & 0 & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}$$

The preceding matrix calculus is not sufficient for all applications to
quantum mechanics. However, it is desirable to defer the remaining
requirements until the occasion for their use arises later. At this
point we wish to return to the field of physics in order to carry out the
plan mentioned in Section 21.

27. The Quantum-Mechanical Interpretation of Matrices. 
In contrast to classical theory, it has been made clear that quantum 
mechanics holds it to be impossible to determine with any desired 
accuracy the state of a system as far as all observables are concerned. 
If p and q are two canonically conjugate observables, a measurement of 
q cancels the result of an antecedent measurement of p and vice versa, 
thus making it impossible to arrive at an exact knowledge of both p 
and q. Hence, according to quantum mechanics, it is meaningless to 
imagine a system to be described by specifying the values of all 
observables. No experiment could justify such a description. The 
only way in which we can proceed is as follows. We consider a system
which is subject to certain physical conditions. What are we able to
find out about the state of the system? As we know, there is the 
difficulty that one measurement is sufficient to destroy the state 
and thus all further measurements will be useless. Thus, as long as 
our investigation has to do with one system only, the results will 
consist of a single datum. However, the state we are investigating 
can be realized for a great number of systems which are exposed to 
exactly the same conditions. Then we can make as many measure- 
ments as there are systems and work out a statistical ensemble that 
will inform us about the probability of finding a given value of an
observable. We assume that the same statistical array holds for any
sufficiently large partial assemblage. Thus the conditions are such 
that they guarantee any system to be in the same state. The state 
is then called ‘well-defined in the quantum-mechanical sense,” and 
we consider it as completely described by the statistical array which 
contains everything that experimental possibilities permit to be 
known about the state. 

Thus the new mechanics is satisfied with a definition of the state 
that is far short of the requirements of classical physics. It is possible
that the value of not even one observable is known with certainty, 
and yet the state possesses a maximum of definition. Of course, 
there are states too for which the values of one or more observables, 
such as energy and angular momentum, have certainty. But it is 
impossible that this apply to all observables, for in the case of two 
canonically conjugate observables such as coordinates and momenta 
a simultaneous exact measurement is impossible. 

It is the purpose of quantum mechanics to represent any state of a 
given system by a unit vector in a Hilbert space with an infinite 
number of dimensions and to interpret as probability amplitudes the 
components of x along the axes of certain coordinate systems, the 
probabilities being set equal to the squares of their absolute values. As for the
coordinate systems, it is assumed that any observable is representable 
by a matrix operator and that the coordinate system associated with an 
observable is defined by the principal axes of the operator. At first this 
may seem a rather strange procedure, but it is easy to realize that the 
method is exactly adequate to the possibilities of observation. 

Let us consider a system the energy of which can only assume values
$E_0, E_1, \ldots$, and which is supposed to be not degenerate as far as
the energy is concerned, so that there are not two different states of
the same energy. The conditions to which the system is subject may
be such that the energy value $E_i$ may be expected with the probability
$|x_i|^2$, $\left(\sum_i |x_i|^2 = 1\right)$. In the Hilbert space we then can set up a
normal coordinate system $K$, the axes of which can be correlated to
the different $E_i$, and construct a vector the components of which
relative to $K$ are given by $x_i = |x_i| e^{i\delta_i}$, the $\delta_i$ terms being constants
which may be left undetermined for the present.

We now correlate a linear operator with any observable of the
system, that is, with any quantity such as the coordinates and
momentum components of the electron belonging to an atom.
When referred to K, the operator shall be described by a Hermitean
matrix. We shall see later that only Hermitean matrices enter into
this question. We demand that the eigenvalues be given by the possible
values of the observable under consideration. At the moment we are
not interested in how these operators or matrices can be determined.
We are concerned only with the method and assume that we already
know the operator for any observable. For example, in the case of a
linear oscillator the operators associated with the coordinate, the
momentum, and so on, of the oscillating particle may be the matrices
Q, P, ... respectively. Then every one of these defines a normal
coordinate system K', K'', ... by the directions of its eigenvectors.
The postulate of quantum mechanics then is: The vector x, which, when
referred to the energy system K, has the components $x_k$, will have,
when referred to the systems K', K'', ..., the components $x_k'$,
$x_k''$, ..., which by the values $|x_k'|^2$, $|x_k''|^2$, ... determine
the probabilities with which a measurement of the observable under
consideration yields the eigenvalue belonging to the kth axis of the
corresponding coordinate system.

We shall use the theorem first to fix the phase constants $\delta_i$
which we left undetermined. From the statistics of the energy values we
can infer only the $|x_i|^2$ values, so that measurements of the energy
alone are insufficient to determine the vector x uniquely. To understand
this we must recall the possibilities for the simultaneous measurement
of two conjugate quantities according to the uncertainty relations.
If the measurement of one of them is inexact, this lack of exactness is
compensated by the possibility of measuring the other within certain
limits of accuracy. Thus a state of which the energy is not exactly
known can be considered as well defined only if, together with the
energy, the conjugate coordinate (the time t as the phase of the motion)
also is measured with the maximum accuracy which accords with
the uncertainty relations. This is why the terms $|x_i|^2$ do not suffice
for the unique determination of the vector x. Besides the energy we
must take into account the phase (or another observable that commutes
with the phase) in order to define the state uniquely. This
means that the $\delta_i$ values are to be chosen in such a way that the
components of x in the coordinate system associated with the phase
provide the correct probability amplitudes of the phase. Only then is
the vector x fixed uniquely, and, if quantum mechanics is correct, it
must now furnish automatically the correct probability amplitudes for
all other observables of the system.
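The part played by the phase constants can be made concrete with a small numerical sketch. It is not taken from the text: it assumes a hypothetical two-level system and an invented observable A whose principal axes are known in closed form, and it compares two vectors with the same $|x_i|^2$ but different $\delta_i$.

```python
import cmath
import math

def inner(u, v):
    """Scalar product u* . v of two complex vectors given as lists."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def state(moduli, phases):
    """Components x_k = |x_k| e^{i delta_k} relative to the energy system K."""
    return [r * cmath.exp(1j * d) for r, d in zip(moduli, phases)]

# Assumed (hypothetical) observable A = [[0, 1], [1, 0]] referred to K.
# Its principal axes are (1, 1)/sqrt(2) and (1, -1)/sqrt(2),
# belonging to the eigenvalues +1 and -1.
S2 = 1.0 / math.sqrt(2.0)
AXES = [[S2 + 0j, S2 + 0j], [S2 + 0j, -S2 + 0j]]

def probabilities(x):
    """Squared moduli of the components of x along the principal axes of A."""
    return [abs(inner(axis, x)) ** 2 for axis in AXES]

moduli = [math.sqrt(0.8), math.sqrt(0.2)]   # identical energy statistics
p_in_phase = probabilities(state(moduli, [0.0, 0.0]))
p_shifted = probabilities(state(moduli, [0.0, math.pi / 2]))

print(p_in_phase)   # approximately [0.9, 0.1]
print(p_shifted)    # approximately [0.5, 0.5]
```

Both vectors give the energy values with the probabilities 0.8 and 0.2, yet they predict different statistics for A, which is why the energy statistics alone cannot fix the state.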

The phase becomes meaningless only if the direction of x coincides
with an axis of K, that is, if for a certain k, $|x_k|^2 = 1$, all other
$x_i$ being zero. Then the state corresponds to a certain energy
$E = E_k$, and the components of x in the phase coordinate system are
all equal since the phase can assume any value with the same
probability. In this case x is determined uniquely by $|x_k|^2 = 1$
except for a factor $e^{i\delta}$ which has no physical meaning.

28. The Commutation Relations. We have seen that in quantum
mechanics there is correlated to any observable of a system a linear
operator which transforms a vector x into a vector y. The relation
between the observable and the operator is this: when the vector x
with components $x_i$ in the arbitrarily chosen reference system K has
the component $x_k'$ in the direction of the kth principal axis of the
operator, the probability that a measurement of the observable will
furnish the eigenvalue $a_k$ belonging to the corresponding axis is given
by $x_k'^{*} x_k'$. The values $a_k$ represent the possible results of the
measurement, and if x should have the direction of the kth axis we may
expect $a_k$ with certainty. From this interpretation of the formalism we
can infer at once an important relationship between the matrices
associated with two canonically conjugate observables. If we designate
two such matrices by Q and P, associated, for example, with the
coordinate and momentum of a linear oscillator, and by K' and K'' the
corresponding principal systems, we can maintain that the axes of K'
cannot coincide with those of K''. If this were so, a vector lying in
the common direction of two principal axes would correspond to a state
which is sharply defined relative to both of the observables
simultaneously. This would be a contradiction to the fundamental
postulate of quantum mechanics. Accordingly we can infer that two
matrices P and Q belonging to canonically conjugate observables cannot
be transformed to principal axes simultaneously. According to Section 26
this conclusion can be expressed by $QP - PQ \neq 0$. This statement
implies that the two intervals of accuracy, $\Delta q$ and $\Delta p$,
within which the values of q and p are found by measurement, can, by no
measuring arrangement, be confined simultaneously to zero. According to
the uncertainty relations under no circumstances can the product
$\Delta q\,\Delta p$ be made less than h. In quantum mechanics this
fundamental postulate is expressed by the so-called commutation
relations,

$$QP - PQ = \frac{\hbar}{i}E \tag{136}$$


To prove this we shall consider the two matrices Q and P as representing
coordinate x and momentum $\xi$ respectively of a particle moving in
the x direction. Since both these quantities may assume any value
between $-\infty$ and $+\infty$, each of the operators possesses a
continuous set of eigenvalues and as a result there is a continuous
infinity of principal axes forming the systems K' and K''. Therefore the
components of the vector x can have values relative to K' and K'' only
in the form of functions $u(x)$ and $w(\xi)$. We have used these
functions previously wherein $u^{*}u$ and $w^{*}w$ were defined as
probabilities of the values x and $\xi$. This is in accord with the
present interpretation of u and w as components of x.

Now let us consider the transition from the principal energy system
K to the x system K'. This transformation is effected by a
transformation matrix $S = \|s_{ik}\|$. If the components of x in K and
K' are represented by $x_i$ and $x_k'$ [$x_k'$ is identical with
$u(x)$], then $x' = Sx$ or $x_k' = \sum_i s_{ki}x_i$. In the
transformation Q and P become

$$Q' = SQS^{-1} \quad\text{and}\quad P' = SPS^{-1}$$


These new matrices have a peculiar nature in that they refer to
coordinate systems with continuous sets of axes and thus themselves have
continuous structure. When they act as operators on x, which has
components $u(x)$ in K', the vector is transformed to Q'x and P'x
respectively. Now, according to Section 26, we obtain Q'x and P'x by
making use of the multipliers x and $-(\hbar/i)(\partial/\partial x)$
respectively. Hence Q'x is a vector with components $xu(x)$, and P'x is
one with components $-(\hbar/i)(\partial u/\partial x)$. Furthermore
Q'P'x is the vector obtained when P' and Q' operate on x in that order.
In this case the components of Q'P'x are
$-x(\hbar/i)(\partial u/\partial x)$, whereas the components of P'Q'x
are $-(\hbar/i)(\partial/\partial x)(xu)$. Hence (Q'P' - P'Q')x is

$$-\frac{\hbar}{i}\,x\frac{\partial u}{\partial x} + \frac{\hbar}{i}\frac{\partial}{\partial x}(xu) = \frac{\hbar}{i}\,u$$

or

$$Q'P' - P'Q' = \frac{\hbar}{i}E$$

Now

$$Q' = SQS^{-1} \quad\text{and}\quad P' = SPS^{-1}$$

Therefore

$$S(QP - PQ)S^{-1} = \frac{\hbar}{i}E = S\,\frac{\hbar}{i}E\,S^{-1}$$

and

$$QP - PQ = \frac{\hbar}{i}E$$


We see that this proof eventually reverts to the uncertainty relations,
for the assumption that x and $\xi$ are represented by the operators x
and $-(\hbar/i)(\partial/\partial x)$ is based on the relation

$$u(x) = \int w(\xi)\,e^{-(i/\hbar)x\xi}\,d\xi$$

However, we were led to this relation by the desire to represent the
state of a particle by a wave packet, as only in this way can we give a
description that satisfies the uncertainty relations.

Thus the commutation relation (136) is nothing but the translation of
the uncertainty relations into matrix notation, and consequently we may
be certain that it holds for any pair of canonically conjugate observables.

On the other hand, let us consider two observables, with matrices A and
B, which can be measured simultaneously with any desired accuracy. In
this case it must be possible to transform the corresponding matrices
simultaneously to principal axes, for there exist states the vector of
which coincides simultaneously with a principal axis of A and one of B.
This means that the matrices A and B are commutative so that
$AB - BA = 0$. The coordinates x, y, z are examples of observables which
can be measured simultaneously. The same is true of the momentum
components $\xi, \eta, \zeta$. If we denote the corresponding matrices by
$Q_x, Q_y, Q_z$ and $P_x, P_y, P_z$, then

$$Q_iQ_k - Q_kQ_i = 0 \quad\text{and}\quad P_iP_k - P_kP_i = 0 \qquad (i,k = x, y, z)$$

Moreover we can write

$$Q_iP_k - P_kQ_i = 0 \qquad (i \neq k)$$

because, for example, x and $\eta$ can be measured independently.

The general meaning of the commutation relations can be expressed 
as follows: According to quantum mechanics the relation between two 
observables depends essentially on whether the possibility of a simul- 
taneous measurement is given. If given, the corresponding matrices 
must be capable of a simultaneous transformation to principal axes, 
that is, they are commutative. If not given, $QP \neq PQ$. For the
special case of canonically conjugate observables, $QP - PQ = (\hbar/i)E$.

If two Hermitean matrices Q and P satisfy (136), we shall call them
canonical, implying by this that the matrices are correlated to each
other in the same sense as two canonical entities of classical mechanics.

It is readily seen that canonical matrices have to be infinite, since
for finite matrices the sum of the diagonal elements, the so-called
"spur" of $QP - PQ$, would be
$\sum_{i,k}(q_{ik}p_{ki} - p_{ik}q_{ki}) = 0$, whereas according to (136)
the sum should be $n(\hbar/i)$, where n is the order of the matrices.
Thus (136) can be satisfied only by matrices for which
$\sum_{i,k}(q_{ik}p_{ki} - p_{ik}q_{ki}) \neq 0$. This is possible only
for infinite matrices, for then there is a difference depending on
whether the summation of $q_{ik}p_{ki}$ is carried out first over i or
over k.
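The vanishing of the spur for finite matrices is easily verified by direct computation. The sketch below uses two arbitrarily chosen 3-by-3 Hermitean matrices; the numbers are invented for the illustration and have no physical significance.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][l] * b[l][k] for l in range(n)) for k in range(n)]
            for i in range(n)]

def spur(a):
    """Sum of the diagonal elements."""
    return sum(a[i][i] for i in range(len(a)))

# Two arbitrary finite Hermitean matrices (hypothetical values).
Q = [[1, 2, 0],
     [2, -1, 3],
     [0, 3, 4]]
P = [[0, 1j, 2],
     [-1j, 5, 1j],
     [2, -1j, -3]]

QP = matmul(Q, P)
PQ = matmul(P, Q)
commutator = [[QP[i][k] - PQ[i][k] for k in range(3)] for i in range(3)]

print([commutator[i][i] for i in range(3)])  # individual diagonal elements
print(spur(commutator))                      # their sum vanishes
```

The single diagonal elements of $QP - PQ$ are in general different from zero, but their sum always cancels, so that no finite pair can reproduce $n(\hbar/i)$.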




29. Hermitean Forms and Expectation Values. When we
represent the state of a system by a vector x of components $x_i$
relative to the coordinate system K of the energy, the result of an
energy measurement is expected to be $E_i$ with the probability
$x_i^{*}x_i$. Thus the state $x(x_0, x_1, \ldots)$, as far as energy is
concerned, corresponds to the expectation value

$$\bar{E} = \sum_i x_i^{*}x_i E_i \tag{137}$$

according to which, if x has the direction of the kth axis of K,

$$\bar{E} = E_k$$

Let us further consider an arbitrary observable a corresponding to
the Hermitean matrix A. Let us denote the eigenvalues of A by
$a_0, a_1, a_2, \ldots$ and let the normal coordinate system formed by
the principal axes of A be K'. Then the $a_i$ terms represent the
possible results of a measurement and are found with the probabilities
$x_i'^{*}x_i'$. The $x_i'$ terms can be evaluated from the $x_i$ terms by
a transformation matrix S which performs the transition K → K'
according to $x' = Sx$ or $x_i' = \sum_k s_{ik}x_k$. Thus the probability
$x_i'^{*}x_i'$ can be written

$$x_i'^{*}x_i' = \sum_{k,m} s_{ik}^{*}x_k^{*}s_{im}x_m$$

If the vector x signifies an energy state, $E = E_k$, that is, if all the
$x_i$ vanish with the exception of $|x_k|^2 = 1$, we have

$$x_i'^{*}x_i' = s_{ik}^{*}s_{ik} = |s_{ik}|^2$$

Therefore the matrix S can be given the following definition: The
elements of the kth column, $s_{ik}$, give, by $|s_{ik}|^2$, the
probabilities that a will have the value $a_i$ for a state of energy
$E_k$.

There is yet another important theorem. Again consider an arbitrary
state $x(x_0, x_1, \ldots)$ and determine the expectation value of a
quantity a. This can be expressed by

$$\bar{a} = \sum_i x_i'^{*}x_i'a_i$$

According to (118) this can be interpreted as the scalar product of two
vectors the components of which, relative to K', are $x_i'$ and
$x_i'a_i$ respectively. The first vector is simply x, and the other is
identical with Ax, for the operator A has the effect of changing the
$x_i'$ terms into $a_ix_i'$ because the axes of K' are the eigenvectors
of A. Thus, $a_{ik}$ denoting the elements of A relative to K,

$$\bar{a} = (x, Ax) = \sum_{i,k} a_{ik}x_i^{*}x_k \tag{138}$$




This equation shows why a Hermitean matrix must be correlated to
any observable. Let us assume first that x has the direction of an
energy axis. The sum reduces to $a_{ii}x_i^{*}x_i$, and the diagonal
elements of A must be real since, by its very nature, the expectation
value of a is always real. Furthermore, if x has only two components
$x_i$ and $x_k$ which are different from zero, we have

$$\bar{a} = a_{ii}x_i^{*}x_i + a_{ik}x_i^{*}x_k + a_{ki}x_k^{*}x_i + a_{kk}x_k^{*}x_k$$

and this sum is real only if $a_{ik} = a_{ki}^{*}$ or $A = A^{\dagger}$.
Therefore a physical entity can be connected only with a Hermitean
matrix and we can state the following theorem: If A is the matrix
belonging to the observable a, the expectation value of a for the state
represented by the vector x is given by $\sum_{i,k} a_{ik}x_i^{*}x_k$.
In this theorem, also, the expectation value $\bar{E}$ of (137) is
implied, since, relative to the system K which consists of the principal
energy axes, the operator of the energy is defined by the diagonal
matrix,

$$H = \begin{pmatrix} E_0 & 0 & 0 & \cdots \\ 0 & E_1 & 0 & \cdots \\ 0 & 0 & E_2 & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}$$

so that $\sum_{i,k} h_{ik}x_i^{*}x_k = \sum_i E_i x_i^{*}x_i$.
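A short computation illustrates why only Hermitean matrices qualify. The sketch below evaluates the form $\sum a_{ik}x_i^{*}x_k$ for one Hermitean and one non-Hermitean 2-by-2 matrix; both matrices and the state are hypothetical values chosen for the example.

```python
def expectation(matrix, x):
    """Hermitean form sum_{i,k} a_ik x_i* x_k for a state vector x."""
    n = len(x)
    return sum(matrix[i][k] * x[i].conjugate() * x[k]
               for i in range(n) for k in range(n))

x = [0.6 + 0j, 0.8 + 0j]            # normalized: 0.36 + 0.64 = 1

A = [[1, 2 - 1j],                   # Hermitean: a_ik = a_ki*
     [2 + 1j, 3]]
B = [[1, 2j],                       # not Hermitean: b_01 != b_10*
     [2j, 3]]

a_bar = expectation(A, x)
b_bar = expectation(B, x)

print(a_bar)   # real, approximately 4.2
print(b_bar)   # complex, approximately 2.28 + 1.92j
```

For A the two mixed terms are complex conjugates of each other and their imaginary parts cancel; for B they add, and the "expectation value" ceases to be real.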

Up to this point the energy has been distinguished by defining the 
vector x in terms of components relative to the principal system K of 
the energy. It goes without saying that this preference for a certain 
observable is quite arbitrary and appears justified only because of 
the special importance of the energy. We could, as well, define the 
vector by its components $x_i'$ relative to the system K' of any other
observable. Then the matrices have to be changed into

$$A' = SAS^{-1}$$

S being the unitary transformation matrix corresponding to the
transition K → K'. The energy operator H then loses its character
of a diagonal matrix and transforms into $H' = SHS^{-1}$, with the
elements

$$h_{ik}' = \sum_{m,n} s_{im}h_{mn}s_{nk}^{-1} = \sum_m s_{im}E_m s_{mk}^{-1}$$

The formulas for the expectation values are not changed by the
transition to another coordinate system, since $\bar{a}$ is given by the
scalar product $x^{*}\cdot Ax$, which remains invariant and is expressed
in the K' system by

$$x^{*}\cdot Ax = x'^{*}\cdot A'x' = \sum_{i,k} a_{ik}'x_i'^{*}x_k'$$


Thus the same physical entity is represented by different matrices 
which depend on the coordinate system to which the representation is 
referred and which are connected with each other by unitary trans- 
formations. A unitary transformation does not destroy the Hermitean 
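character of the matrices. Together with A, $A' = SAS^{-1}$ is also
Hermitean, for, since

$$S^{\dagger} = S^{-1} \quad\text{and}\quad (S^{-1})^{\dagger} = S$$

$$A'^{\dagger} = (S^{-1})^{\dagger}A^{\dagger}S^{\dagger} = SAS^{-1} = A'$$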
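Both statements, the invariance of the expectation value and the preservation of the Hermitean character, can be tried out on a small example. The sketch assumes a hypothetical 2-by-2 unitary matrix S built from an angle and a phase, together with an invented Hermitean matrix A.

```python
import cmath
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][l] * b[l][k] for l in range(n)) for k in range(n)]
            for i in range(n)]

def dagger(a):
    """Conjugate transpose."""
    n = len(a)
    return [[a[k][i].conjugate() for k in range(n)] for i in range(n)]

def matvec(a, x):
    return [sum(a[i][k] * x[k] for k in range(len(x))) for i in range(len(a))]

def inner(u, v):
    return sum(p.conjugate() * q for p, q in zip(u, v))

def is_hermitean(a, tol=1e-12):
    n = len(a)
    return all(abs(a[i][k] - a[k][i].conjugate()) < tol
               for i in range(n) for k in range(n))

theta, phi = 0.7, 0.4   # arbitrary parameters of the assumed unitary S
c, s = math.cos(theta), math.sin(theta)
S = [[c + 0j, s * cmath.exp(1j * phi)],
     [-s * cmath.exp(-1j * phi), c + 0j]]

A = [[1 + 0j, 2 - 1j],
     [2 + 1j, 3 + 0j]]
x = [0.6 + 0j, 0.8j]

A_prime = matmul(matmul(S, A), dagger(S))   # S^{-1} = S^dagger for unitary S
x_prime = matvec(S, x)

a_bar = inner(x, matvec(A, x))
a_bar_prime = inner(x_prime, matvec(A_prime, x_prime))

print(is_hermitean(A_prime), abs(a_bar - a_bar_prime) < 1e-12)
```

The same physical quantity is thus described by A in K and by A' in K', with identical expectation values.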


30. Coordinate Systems with a Continuous Infinity of Axes. 
The Passage from Matrix to Wave Mechanics. We are con- 
fronted with a peculiar situation when the representing vector x is 
referred to a system K which is the principal system of an observable 
with a continuous infinity of eigenvalues. Examples of such are the 
coordinate x and the momentum $\xi$ of a particle, for the measurement
of these may furnish any value between $-\infty$ and $+\infty$. When the
representation is referred to such a coordinate system, matrix mechanics
takes
on the form of wave mechanics. 

In what follows, x signifies an observable which can assume any
value between $-\infty$ and $+\infty$. Then K consists of a continuum of
axes, each of them belonging to a certain value of x. If the vector x
has the direction of any of these axes, a measurement of x furnishes the
value belonging to the axis, and we can say that the vector has the
"direction x." For every axis let us define a vector $e_x$, the
subscript denoting its correlation to the eigenvalue. (We can no longer
enumerate the axes.) As the axes of K are orthogonal to one another,
the scalar product of two vectors $e_x$ and $e_{x'}$ must vanish, that
is, $e_x^{*}\cdot e_{x'} = 0$ for $x' \neq x$. This corresponds to the
second equation of (117) in Section 22. The first equation,
$|e_x|^2 = 1$, no longer holds, for if it did we could not possibly
arrive at an appropriate formalism. To arrive at a simple theory the
vectors $e_x$ must be chosen in such a way that the integral of
$e_x^{*}\cdot e_{x'}$ over a domain surrounding x, however small
this domain be chosen, has the value unity. Thus

$$\int (e_x^{*}\cdot e_{x'})\,dx' = 1 \tag{139}$$

As $e_x^{*}\cdot e_{x'} = 0$ for $x' \neq x$, the integral may be
extended from $-\infty$ to $+\infty$. The usual notation for the
equation is

$$e_x^{*}\cdot e_{x'} = \delta(x - x') \tag{140}$$

$\delta(x - x')$ being defined as an improper function of x which
vanishes when x' is not equal to x. When x equals x', however, it becomes
infinite in such a way that the integral has the value unity. The
transition from matrix to wave mechanics is carried out in such a way
that, in the case of a system K with a continuous infinity of axes,
equation (140) replaces (117).

In order to develop the new formalism let us consider the vector x.
Designate its components relative to K by $\psi(x)$. This new function
$\psi(x)$ is used now instead of $x_0, x_1, \ldots$, since the components
are no longer denumerable. Accordingly the vector x will be represented
by the equation

$$x = \int e_x\psi(x)\,dx \tag{141}$$

which takes the place of the equation $x = \sum_i x_ie_i$. Because of
(140) this gives us, for $|x|^2$,

$$|x|^2 = x^{*}\cdot x = \int\psi^{*}(x)\,dx\int\psi(x')\,(e_x^{*}\cdot e_{x'})\,dx' = \int\psi^{*}(x)\,dx\int\psi(x')\,\delta(x - x')\,dx' = \int\psi^{*}\psi\,dx$$

This means that we can interpret $\psi^{*}\psi\,dx$ as the probability of
x having a value between x and x + dx so that we can normalize x by the
requirement $\int\psi^{*}\psi\,dx = 1$.

Now in addition to K we introduce another system K' associated
with an observable $\xi$ which also has a continuous character. We shall
denote the unit vectors of K' by $e_\xi$ and the components of x
relative to K' by $\chi(\xi)$. Then we have

$$x = \int e_x\psi(x)\,dx = \int e_\xi\chi(\xi)\,d\xi$$

By forming the scalar product $e_x^{*}\cdot x$ we get

$$e_x^{*}\cdot x = \int (e_x^{*}\cdot e_{x'})\psi(x')\,dx' = \int (e_x^{*}\cdot e_\xi)\chi(\xi)\,d\xi$$

from which it follows that

$$\psi(x) = \int (e_x^{*}\cdot e_\xi)\chi(\xi)\,d\xi \tag{142}$$


The scalar products occurring in the integral will depend, of course, on
the nature of the observables x and $\xi$. Therefore we shall evaluate
the case in which x and $\xi$ signify the coordinate and momentum
component of a particle. According to Section 28 the operator of $\xi$
in the K system is $-(\hbar/i)(\partial/\partial x)$ and has the
eigenfunctions $ae^{-(i/\hbar)x\xi}$, where a is still an undetermined
constant. These eigenfunctions represent the resolution relative to the
K axes of a vector $e_\xi$. The constant a can be determined with the
aid of the condition expressed by (140). Setting
$e_\xi = \int e_x\,ae^{-(i/\hbar)x\xi}\,dx$, we obtain

$$\begin{aligned}
\int (e_\xi^{*}\cdot e_{\xi'})\,d\xi' &= a^2\int d\xi'\int e^{(i/\hbar)x\xi}\,dx\int (e_x^{*}\cdot e_{x'})\,e^{-(i/\hbar)x'\xi'}\,dx' \\
&= a^2\int d\xi'\int e^{(i/\hbar)x(\xi - \xi')}\,dx \\
&= \lim_{g\to\infty} a^2\int \frac{2\hbar\sin[g(\xi' - \xi)/\hbar]}{\xi' - \xi}\,d\xi'
\end{aligned}$$

Hence, by (139),

$$\lim_{g\to\infty} a^2\int \frac{2\hbar\sin[g(\xi' - \xi)/\hbar]}{\xi' - \xi}\,d(\xi' - \xi) = a^2\hbar\,2\pi = 1 \qquad a = h^{-1/2}$$

Because $e_\xi = h^{-1/2}\int e_{x'}e^{-(i/\hbar)x'\xi}\,dx'$ and because
of (140), we now obtain for $e_x^{*}\cdot e_\xi$

$$e_x^{*}\cdot e_\xi = h^{-1/2}\int (e_x^{*}\cdot e_{x'})\,e^{-(i/\hbar)x'\xi}\,dx' = h^{-1/2}e^{-(i/\hbar)x\xi} \tag{143}$$

Accordingly equation (142) changes over into the relation

$$\psi(x) = h^{-1/2}\int \chi(\xi)\,e^{-(i/\hbar)x\xi}\,d\xi$$

which, when applied to x, y, z, leads to the wave mechanics equation
(60').
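The limit that fixes the constant a can be approximated numerically. The following sketch is an illustration only: it assumes units with h = 1, replaces the limit by one large but finite value of g, and integrates over a finite interval with the midpoint rule, so the result agrees with h only to within about one per cent.

```python
import math

H_PLANCK = 1.0                        # assumed units
HBAR = H_PLANCK / (2.0 * math.pi)

def kernel_integral(g, half_width, steps=200_000):
    """Midpoint estimate of the integral of 2*hbar*sin(g*u/hbar)/u, |u| < half_width.

    Here u stands for xi' - xi. With an even number of steps the
    midpoints never coincide with u = 0.
    """
    du = 2.0 * half_width / steps
    total = 0.0
    for n in range(steps):
        u = -half_width + (n + 0.5) * du
        total += 2.0 * HBAR * math.sin(g * u / HBAR) / u
    return total * du

integral = kernel_integral(g=50.0, half_width=1.0)
a_estimate = integral ** -0.5          # from a^2 * integral = 1

print(integral)     # close to h = 2*pi*hbar
print(a_estimate)   # close to h**-0.5
```

As g increases the integral approaches $2\pi\hbar = h$, in agreement with $a = h^{-1/2}$.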

In a coordinate system with a continuum of axes, the matrix is
replaced by a linear operator D by means of which a vector x, described
by a function $\psi(x)$, is transformed into x' corresponding to a
function $D\psi(x)$. The linearity of D is expressed by the conditions

$$D(\varphi + \psi) = D\varphi + D\psi \quad\text{and}\quad D(k\psi) = kD\psi$$

where k is an ordinary number. The operator
$-(\hbar/i)(\partial/\partial x)$ evidently satisfies these conditions,
whereas the operator which consists in taking the square of the function
does not. Of course, a matrix is also a linear operator, and hence the
term linear operator may be applied to matrices as well. We have seen
that the expectation value of an observable represented by a matrix A is
given by $\bar{a} = x^{*}\cdot Ax$. In the case of an operator D working
on a vector $x = \psi(x)$, we have to write

$$\bar{a} = x^{*}\cdot Dx = \int\psi^{*}(x)\,dx\int D\psi(x')\,(e_x^{*}\cdot e_{x'})\,dx' = \int\psi^{*}D\psi\,dx \tag{144}$$

For a Hermitean matrix we have $\bar{a} = \bar{a}^{*}$. Thus a linear
operator will be Hermitean if

$$\int\psi^{*}D\psi\,dx = \int (D\psi)^{*}\psi\,dx \tag{145}$$


The relation (144) can be used for the transformation of an operator
into a matrix. We assume K and K' are systems belonging to observables
with a continuous and a discrete spectrum of eigenvalues respectively.
For instance, in K the coordinate x may be diagonal and in
K' the energy E may be diagonal. In K the operator representing x is
given by multiplication by x, and in K' by a matrix $Q = \|q_{ik}\|$, so
that the expectation value $\bar{x}$ for a state x represented in K by
the function $\psi(x)$ and in K' by $x_0, x_1, \ldots$ is given by

$$\bar{x} = x^{*}\cdot Qx = \sum_{i,k} q_{ik}x_i^{*}x_k = \int\psi^{*}x\psi\,dx$$

When $\sum_i x_i\psi_i$ is substituted for $\psi(x)$ on the right-hand
side, $\psi_i$ being the eigenfunctions of the energy operator, we get

$$\sum_{i,k} q_{ik}x_i^{*}x_k = \sum_{i,k} x_i^{*}x_k\int\psi_i^{*}x\psi_k\,dx$$

and, since the equation must hold for any vector $x(x_0, x_1, \ldots)$,

$$q_{ik} = \int\psi_i^{*}x\psi_k\,dx \tag{146}$$

Similarly the elements $p_{ik}$ of the momentum matrix P are found to be

$$p_{ik} = -\frac{\hbar}{i}\int\psi_i^{*}\frac{\partial\psi_k}{\partial x}\,dx$$

We have already evaluated the integral $\int\psi_i^{*}x\psi_k\,dx$ in the
case of the oscillator in Section 20. As we shall see in the following
section, the values for $x_{i,i-1}$ and $x_{i,i+1}$ of (109) agree
exactly with those that can be deduced from the matrix mechanical
formalism.
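Equation (146) lends itself to a direct numerical check. The sketch below rests on assumptions not contained in this section: it uses the familiar oscillator eigenfunctions $\psi_0$ and $\psi_1$ in units with $\hbar = m = \omega = 1$ and evaluates the integral by the trapezoidal rule on a finite interval. In these units the element $q_{01}$ should come out as $1/\sqrt{2}$.

```python
import math

def psi(n, x):
    """Oscillator eigenfunctions psi_0 and psi_1 (assumed units hbar = m = omega = 1)."""
    gauss = math.pi ** -0.25 * math.exp(-0.5 * x * x)
    if n == 0:
        return gauss
    if n == 1:
        return math.sqrt(2.0) * x * gauss
    raise ValueError("only n = 0 and n = 1 are provided in this sketch")

def matrix_element_x(i, k, half_width=10.0, steps=4000):
    """Trapezoidal estimate of q_ik = integral of psi_i * x * psi_k dx."""
    dx = 2.0 * half_width / steps
    total = 0.0
    for n in range(steps + 1):
        x = -half_width + n * dx
        weight = 0.5 if n in (0, steps) else 1.0
        total += weight * psi(i, x) * x * psi(k, x)
    return total * dx

q00 = matrix_element_x(0, 0)
q01 = matrix_element_x(0, 1)

print(q00)   # vanishes by symmetry
print(q01)   # approximately 0.7071
```

The diagonal element vanishes and only the element connecting neighboring states survives, as the matrix treatment of the oscillator requires.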

31. The Fundamental Problem of Matrix Mechanics. We 
have seen that the essential idea of quantum mechanics is to correlate 
a certain operator to any observable of the system considered. When 
the operators are known, we need only to construct the corresponding 
principal systems K in a Hilbert space and to project onto the axes of 
K the vector which, on the basis of statistical investigations, is found 
to represent the state of the system. If we but know how the operators
are to be ascertained for a given system, this procedure leads to a
complete description of the state. It is true that we cannot say then
how the state will change in time, and so there remains the problem of
finding the law pertaining to dx/dt. If we wish to develop matrix
mechanics as a self-contained discipline, we must not demand the
assistance of wave mechanics but must find a method which, together
with the commutation relations, can furnish the operator for any
observable. It is a matter of course that this problem can be solved
only if certain physical information about the nature of the system is
given. The situation is the same as in classical mechanics where a
description of the system is provided by the Hamiltonian function
which expresses the energy of the system in terms of coordinates and 
momenta. We shall follow the same method, keeping in mind, how- 
ever, that we must not seek a relationship connecting the value of the 
energy with those of coordinates and momenta. The expression of 
such a connection would be meaningless in quantum mechanics 
wherein coordinate and momentum cannot be measured simul- 
taneously. What we can connect are matrices only, or more generally, 
the operators of the observables mentioned above. Therefore we 
shall adopt the following theorem: 


There is for any physical system a certain relationship between the 
matrix H of the energy and the matrices Q and P of coordinate and 
momentum. 


The function H = H(QP), which describes this relationship, is called 
the Hamiltonian, and we assume that it is defined by the nature of 
the system. How this function is to be found for a given system 
cannot be determined by matrix mechanics, just as classical mechanics 
is unable to explain, for example, that for an oscillator the Hamiltonian 
is given by $p^2/2m + aq^2/2$. There is no other instruction for
determining H than to find the function that gives a correct description
of the system. It seems probable, however, that there is a far-reaching
correspondence between the matrix-mechanical and the classical H
so that, for example, the Hamiltonian of an oscillator may be supposed
to be

$$H(QP) = \frac{P^2}{2m} + \frac{aQ^2}{2}$$


This suggestion that the classical expression for H be simply translated 
into a matrix function is, however, not unique, for in classical mechan- 
ics the multiplication of two quantities is commutative, whereas in 
matrix mechanics it makes a difference whether we write, for example,
$Q^2P$ or $PQ^2$ or $QPQ$. However, in such ambiguous cases we are helped
by the postulate that the matrix must be Hermitean; it is only then
that we are sure that the expectation value $x^{*}\cdot Hx$ of the
energy is real. According to Section 26 a function H of Hermitean
matrices Q and P is Hermitean only if it is self-adjoint, that is, if it
remains identical when read in the inverse order. In the example above,
only the product QPQ has this property. But this does not exclude
$Q^2P$ and $PQ^2$ altogether because the expression
$\tfrac{1}{2}(Q^2P + PQ^2)$ is also Hermitean and could be used for H.
And so, too, the requirement that H has to be Hermitean does not help us
toward a definite decision, and in the end there is no other way to
judge whether a possible H is correct or not than to put H to the test.
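The behavior of the different orderings is easily examined with small matrices. The sketch below assumes two arbitrary 2-by-2 Hermitean matrices, invented for the illustration; they do not satisfy the commutation relation, which is immaterial here since only the Hermitean character of the products is tested.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][l] * b[l][k] for l in range(n)) for k in range(n)]
            for i in range(n)]

def is_hermitean(a, tol=1e-12):
    n = len(a)
    return all(abs(a[i][k] - a[k][i].conjugate()) < tol
               for i in range(n) for k in range(n))

# Arbitrary Hermitean matrices (hypothetical values).
Q = [[1 + 0j, 1j],
     [-1j, 2 + 0j]]
P = [[0j, 2 + 0j],
     [2 + 0j, 1 + 0j]]

Q2 = matmul(Q, Q)
Q2P = matmul(Q2, P)
PQ2 = matmul(P, Q2)
QPQ = matmul(matmul(Q, P), Q)
symmetrized = [[0.5 * (Q2P[i][k] + PQ2[i][k]) for k in range(2)]
               for i in range(2)]

for name, m in [("Q^2 P", Q2P), ("P Q^2", PQ2), ("Q P Q", QPQ),
                ("(Q^2 P + P Q^2)/2", symmetrized)]:
    print(name, is_hermitean(m))
```

Only QPQ and the symmetrized combination pass the test, which shows that the Hermitean requirement narrows the choice without making it unique.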

But let us suppose that for a given system we know the Hamiltonian 
H = H(QP), the matrices H, Q, and P (all of which are considered 
unknown at first) being referred to the normal coordinate system K 
belonging to the energy. Then the method for determining H, Q, P 
is the following: We must find two matrices Q and P which

(i) satisfy the commutation relation

$$QP - PQ = \frac{\hbar}{i}E \tag{148}$$

(ii) give the function H(QP) the form of a diagonal matrix.

The second condition must be imposed upon Q and P because K is
supposed to be the principal system of the energy. Should it turn out
that the problem has only one solution, Q and P represent the operators
corresponding to the coordinate and momentum, and H(QP), by its
diagonal elements, provides the energy values $E_i$ of the system in its
different stationary states.

To illustrate the procedure we consider a linear harmonic oscillator 
the Hamiltonian of which may be assumed to be

$$H = \frac{P^2}{2m} + \frac{aQ^2}{2}$$

We then must solve the equation

$$\frac{P^2}{2m} + \frac{aQ^2}{2} = \begin{pmatrix} E_0 & 0 & 0 & \cdots \\ 0 & E_1 & 0 & \cdots \\ 0 & 0 & E_2 & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix} \tag{149}$$

by matrices Q and P which fulfill condition (148), Q and P as well as
$E_0, E_1, \ldots$ being unknown. When we multiply (149) on the left
and on the right by Q, upon subtraction we get

$$\frac{1}{2m}(QP^2 - P^2Q) = QH - HQ \tag{150}$$
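On the other hand, (148), when multiplied by P, gives

$$QP^2 - PQP = \frac{\hbar}{i}P$$

so that the left-hand side of (150) may be written

$$\frac{1}{2m}\left(\frac{\hbar}{i}P + PQP - P^2Q\right) = \frac{1}{2m}\left[\frac{\hbar}{i}P + P(QP - PQ)\right] = \frac{1}{2m}\left(\frac{\hbar}{i}P + \frac{\hbar}{i}P\right) = \frac{\hbar}{i}\frac{P}{m}$$

Equation (150) then becomes

$$\frac{\hbar}{i}\frac{P}{m} = QH - HQ$$

and for the elements $p_{ik}$ of P we obtain

$$p_{ik} = \frac{mi}{\hbar}\sum_l (q_{il}h_{lk} - h_{il}q_{lk}) = \frac{mi}{\hbar}\,q_{ik}(E_k - E_i) \tag{151}$$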

A corresponding equation for the $q_{ik}$ terms can be derived. We
multiply (149) on the left and on the right by P, and upon subtraction
we obtain

$$\frac{a}{2}(PQ^2 - Q^2P) = PH - HP \tag{152}$$

Furthermore from (148) we find

$$Q^2P - QPQ = \frac{\hbar}{i}Q$$

Accordingly the left-hand side of (152) can be transformed into

$$\frac{a}{2}\left(PQ^2 - \frac{\hbar}{i}Q - QPQ\right) = \frac{a}{2}\left[(PQ - QP)Q - \frac{\hbar}{i}Q\right] = -a\frac{\hbar}{i}Q$$

Thus we obtain

$$-a\frac{\hbar}{i}\,q_{ik} = \sum_l (p_{il}h_{lk} - h_{il}p_{lk}) = p_{ik}(E_k - E_i)$$

or, on substituting the expression (151) for $p_{ik}$,

$$q_{ik}\left[1 - \frac{m}{a\hbar^2}(E_k - E_i)^2\right] = 0 \tag{153}$$

It follows that the $q_{ik}$ terms are not equal to zero only for those
values of i, k for which

$$1 - \frac{m}{a\hbar^2}(E_k - E_i)^2 = 0$$

and thus, since $(1/2\pi)\sqrt{a/m} = \nu_0$, where $\nu_0$ is the
characteristic frequency of the oscillator,

$$E_k - E_i = \pm h\nu_0 \tag{154}$$


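This result suggests the idea of arranging the diagonal elements of the
matrix H in an arithmetic series beginning with an unknown term
$\varepsilon$ and increasing by $h\nu_0$ so that
$E_k = \varepsilon + kh\nu_0$. Then every $q_{ik}$ and $p_{ik}$ term
vanishes except those for which $k = i \pm 1$, and we obtain for the
first diagonal element of H

$$H_{00} = E_0 = \varepsilon = \frac{1}{2m}\sum_l p_{0l}p_{l0} + \frac{a}{2}\sum_l q_{0l}q_{l0} = \frac{1}{2m}p_{01}p_{10} + \frac{a}{2}q_{01}q_{10}$$

Now, because of (151), we have

$$p_{01} = \frac{mi}{\hbar}\,q_{01}h\nu_0 \qquad p_{10} = -\frac{mi}{\hbar}\,q_{10}h\nu_0$$

and furthermore, because of the Hermitean nature of Q,
$q_{ik} = q_{ki}^{*}$. Therefore $q_{01}q_{10} = |q_{01}|^2$; hence the
expression for $\varepsilon$ becomes

$$\varepsilon = |q_{01}|^2\left(\frac{4\pi^2m^2\nu_0^2}{2m} + \frac{a}{2}\right)$$

Now, from the commutation relation it follows that

$$q_{01}p_{10} - p_{01}q_{10} = \frac{\hbar}{i}$$

or

$$-|q_{01}|^2\,4\pi mi\nu_0 = \frac{\hbar}{i} \qquad |q_{01}|^2 = \frac{\hbar}{4\pi m\nu_0}$$

Thus we obtain

$$\varepsilon = \frac{\hbar}{4\pi m\nu_0}\left(\frac{4\pi^2m^2\nu_0^2}{2m} + \frac{a}{2}\right) = \frac{h\nu_0}{2}$$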

The diagonal elements of H are now found to be
$E_k = [(2k + 1)/2]h\nu_0$, and we can evaluate the $q_{ik}$ terms as
follows. If we take the kth diagonal element on both sides, the
commutation relation gives

$$q_{k,k-1}p_{k-1,k} + q_{k,k+1}p_{k+1,k} - p_{k,k-1}q_{k-1,k} - p_{k,k+1}q_{k+1,k} = \frac{\hbar}{i}$$

and thus, because of (151),

$$|q_{k+1,k}|^2 - |q_{k-1,k}|^2 = \frac{\hbar}{4\pi m\nu_0}$$

Thus the $|q_{ik}|^2$ terms also form an arithmetic series with the
initial term $\hbar/4\pi m\nu_0$ and the differences
$\hbar/4\pi m\nu_0$. Therefore

$$|q_{k+1,k}|^2 = \frac{(k + 1)\hbar}{4\pi m\nu_0}$$

The $q_{ik}$ terms are now determined to be

$$q_{k+1,k} = q_{k,k+1}^{*} = \frac{1}{2\pi}\sqrt{\frac{(k + 1)h}{2m\nu_0}} \qquad q_{k,k-1} = q_{k-1,k}^{*} = \frac{1}{2\pi}\sqrt{\frac{kh}{2m\nu_0}} \tag{155}$$

For the $p_{ik}$ terms we obtain

$$p_{k+1,k} = p_{k,k+1}^{*} = -i\sqrt{\frac{(k + 1)hm\nu_0}{2}} \qquad p_{k,k-1} = p_{k-1,k}^{*} = -i\sqrt{\frac{kmh\nu_0}{2}} \tag{156}$$

It can be confirmed easily that the matrices Q and P, by the relation
$P^2/2m + aQ^2/2$, define a diagonal matrix with the elements
$[(2k + 1)/2]h\nu_0$ as required.
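The confirmation can be carried out numerically on truncated matrices. The sketch below assumes units with $h = m = \nu_0 = 1$ and cuts the infinite matrices off at an arbitrary order N; because of the truncation the last row and column are necessarily wrong, so only the elements with $k < N - 1$ are examined.

```python
import math

# Assumed units for the illustration: h = m = nu_0 = 1.
H_PLANCK, M, NU0 = 1.0, 1.0, 1.0
HBAR = H_PLANCK / (2.0 * math.pi)
A_FORCE = 4.0 * math.pi ** 2 * M * NU0 ** 2   # from nu_0 = (1/2 pi) sqrt(a/m)
N = 6                                          # truncation order (hypothetical)

def oscillator_matrices(n):
    """Truncated Q and P built from the elements (155) and (156)."""
    q = [[0j] * n for _ in range(n)]
    p = [[0j] * n for _ in range(n)]
    for k in range(n - 1):
        q_val = math.sqrt((k + 1) * H_PLANCK / (2.0 * M * NU0)) / (2.0 * math.pi)
        p_val = math.sqrt((k + 1) * H_PLANCK * M * NU0 / 2.0)
        q[k + 1][k] = q[k][k + 1] = complex(q_val)
        p[k + 1][k] = -1j * p_val
        p[k][k + 1] = 1j * p_val
    return q, p

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][l] * b[l][k] for l in range(n)) for k in range(n)]
            for i in range(n)]

def combine(a, ca, b, cb):
    """Element-wise ca*a + cb*b."""
    n = len(a)
    return [[ca * a[i][k] + cb * b[i][k] for k in range(n)] for i in range(n)]

Q, P = oscillator_matrices(N)
H = combine(matmul(P, P), 1.0 / (2.0 * M), matmul(Q, Q), A_FORCE / 2.0)
C = combine(matmul(Q, P), 1.0, matmul(P, Q), -1.0)   # QP - PQ

print([round(H[k][k].real, 6) for k in range(N - 1)])  # 0.5, 1.5, 2.5, ...
print(C[0][0], HBAR / 1j)                              # both equal hbar/i
```

The last diagonal element of QP - PQ takes the value that makes the spur vanish, in agreement with the remark that canonical matrices cannot be finite.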

32. Unique Nature of the Solution. We have yet to prove that 
there is no other solution of the problem or, at least, none that would 
be different physically from the one we have just found. This can be 
shown by the following argument, which holds for any nondegenerate
problem. We shall first formulate the problem in another way by
referring not to the principal system K of the energy, but to a quite
arbitrary normal coordinate system $K_0$ which need not be the principal
system for any observable. Now we take two Hermitean matrices
$Q_0$ and $P_0$ which need only satisfy the commutation relation. When
we introduce these matrices into H(QP) we obtain a matrix $H_0$, and
the question is to what coordinate system K must $H_0$ be referred in
order for it to become a diagonal matrix. If we are able to answer
this, the quantum-mechanical problem is solved. For the transition
$K_0 \to K$ transforms $Q_0$ and $P_0$ into $Q = SQ_0S^{-1}$ and
$P = SP_0S^{-1}$, which also are Hermitean and satisfy the commutation
relation. Furthermore $H = SH_0S^{-1}$ is a diagonal matrix, and thus
both postulates of the preceding section are fulfilled. The problem then
is simply to determine, for a given operator, the system of the
corresponding principal axes. This can be done, as we saw in Section 25,
essentially in one way only. That which remains arbitrary, in the case
of a nondegenerate problem, is the choice of the unit vectors $e_k$ of
K, which may be multiplied by $e^{i\delta_k}$ so that there are different
principal systems which can be transformed into one another by a unitary
matrix of the form

$$S = \begin{pmatrix} e^{i\delta_0} & 0 & 0 & \cdots \\ 0 & e^{i\delta_1} & 0 & \cdots \\ 0 & 0 & e^{i\delta_2} & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}$$

It is true that this means there are different K systems, but in all of
them we get the same diagonal matrix for the energy since H is changed
by S into $H' = SHS^{-1} = H$. The matrices Q and P are transformed
by S into other matrices, but this is due only to a change of the unit
vectors and has just as little physical significance here as do the
units in which the coordinates x, y, z are measured in classical
mechanics.
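The effect of such a phase transformation can be followed on a small example. The sketch assumes a hypothetical three-level system with a diagonal H and a real symmetric Q, together with arbitrarily chosen phase constants; none of the numbers is taken from the text.

```python
import cmath
import math

deltas = [0.3, 1.1, -0.7]   # arbitrary phase constants delta_k (assumed)

def phase_transform(matrix, deltas):
    """Elements of S M S^{-1} for S = diag(e^{i delta_k}).

    Because S is diagonal, each element is only multiplied by a phase:
    m'_ik = e^{i(delta_i - delta_k)} m_ik.
    """
    n = len(matrix)
    return [[cmath.exp(1j * (deltas[i] - deltas[k])) * matrix[i][k]
             for k in range(n)] for i in range(n)]

a, b = math.sqrt(0.5), 1.0
H = [[0.5 + 0j, 0j, 0j],
     [0j, 1.5 + 0j, 0j],
     [0j, 0j, 2.5 + 0j]]
Q = [[0j, a + 0j, 0j],
     [a + 0j, 0j, b + 0j],
     [0j, b + 0j, 0j]]

H_prime = phase_transform(H, deltas)
Q_prime = phase_transform(Q, deltas)

print(H_prime == H)                          # the energy matrix is untouched
print(abs(Q_prime[0][1]), abs(Q[0][1]))      # moduli of the q_ik are preserved
print(cmath.phase(Q_prime[0][1]))            # only the phase, delta_0 - delta_1, changes
```

The elements of Q acquire the factors $e^{i(\delta_i - \delta_k)}$ while their absolute values, and with them all probabilities, remain the same.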

Our argument does not yet prove that the solution of the
quantum-mechanical problem is unique, since it could just as well be
that we do not obtain the same solution with every pair of Hermitean
matrices referred to any coordinate system $K_0$. However, we shall
accept without proof that this is not so.

When studying quantum mechanics the beginner is likely to lose 
sight of the physical meaning, fixing his whole attention on the strange 
formalism. Therefore, using the example of the oscillator, let us make 
it clear once more how quantum-mechanical statements must be 
understood. Let us start with the result that the elements of the
diagonal energy matrix are $E_k = [(2k + 1)/2]h\nu_0$. The meaning of
this is: When an energy measurement is made on an oscillator of mass
$m$ and elastic force $ax$, the result is always one of the values $E_0, E_1,
\ldots$. With respect to the matrices $Q$ and $P$ we imagine that the
oscillator is given in the state $x$ which, relative to the energy, is defined
by the coordinates $x_k$. Then a great number of measurements of
position and momentum, carried out on a correspondingly great num-
ber of oscillators all in the same state, on the average, furnish the mean
values $\bar{x} = \sum_{ik} q_{ik}x_i^*x_k$ and $\bar{p} = \sum_{ik} p_{ik}x_i^*x_k$. If we wish to know the
probability of a measurement of $x$ or $p$ giving a certain value $a$, we
have to construct the principal systems $K'$ and $K''$ which correspond
to $Q$ and $P$ and project the vector $x$ onto the axis which belongs to the
eigenvalue $a$.

33. The Dynamical Law of Quantum Mechanics. The 
Principle of Causality. We now have to answer the following 
question. If the state of a closed system is given for time $t = 0$,
that is, if we know the vector $x$ that represents it at that instant, what
can be said about the state of the system at a later time $t$? Evidently
the answer to this question depends on the sense in which the concept 
of causality is understood. In Section 4 we saw that there also must 
exist some sort of causality in quantum mechanics. If we were to 
maintain that the initial state has no bearing on the future, we would 
bring into question the possibility of physics at all. On the other 
hand, however, there can be no doubt that the significance of causality 
in quantum mechanics must be quite different from its significance in
classical physics. The classical position was that, if we know the
state of a closed system for $t = 0$, this state being defined by exact
values of certain observables, we know the exact values of the observa- 
bles for all future time. Quantum mechanics must deny this principle 
on the ground that it has no physical meaning. We cannot measure 
simultaneously the exact values of all observables, and hence we are 
bound, in speaking of “‘states of a system,’’ to use the concept of 
state in a sense that conforms with the experimental possibilities. 
According to these possibilities we can describe a given state by a 
vector x, and the only question is whether it is possible from the vector 
$x_0$, which describes the initial state, to conclude the vector $x$ defining
the state at a future time. We make the assumption that there is a 
law with the help of which x can be evaluated from xo and accept the 
existence of such a law for the quantum-mechanical idea of causality. 
Thus the theorem that, if we know the initial state of a closed system, 
we also know the state for any future time holds both in quantum mechan- 
ics and in classical physics, the only difference being that the concept 
“state” is now given another interpretation which is in accordance 
with the possibilities of observation. 

The situation then is this: Quantum mechanics agrees with the view-
point that there must be a way to conclude the future from the present,
but the conclusion must not be more exact than the premise. From 
one probability we can infer only another probability. The only 
thing we know about the present is that the measurement of an observ- 
able will furnish a given value with a certain probability, and we 
cannot draw conclusions about the future from this knowledge which 
would give us more exact knowledge. Such knowledge would be of 
no use to us anyway, since it would be impossible to subject it to 
experimental test. 

On the basis of these preliminary remarks we are now going to 
formulate a law by means of which the sequence of future states can 
be foreseen. As a matter of fact we are already familiar with the law, 
for it is contained in the postulate that the probability amplitudes $x_k$ of
the energy, as far as time-dependence is concerned, are given by

$$x_k = a_k e^{(i/\hbar)E_k t} \tag{157}$$


This equation implies time-dependence not only for the energy but for 
all other observables as well. In the case of energy, we have already 
emphasized in Section 16, that the probabilities $x_k^*x_k$ of the different
energy values do not vary with time; consequently the energy is sure
of retaining the value $E_k$ if its value at $t = 0$ is sure of being $E_k$. This
is the way in which quantum mechanics expresses the conservation of 
energy. The other observables, however, with the exception of those 
which commute with energy, are not subject to the conservation law. 
For example, consider some observable $\alpha$. The probability of a value
$\alpha_i$ being found for $\alpha$ at $t = 0$, where $\alpha_i$ is an eigenvalue of $\alpha$, is given by

$$\Bigl|\sum_k s_{ik}a_k\Bigr|^2 \tag{158}$$

where $\|s_{ik}\|$ is the transformation matrix that effects the transition
from the principal energy system $K$ to the system $K'$ of $\alpha$. (For the
sake of simplicity we assume a discrete eigenspectrum for $\alpha$.) On
the other hand, the probability for time $t$ is given by

$$\Bigl|\sum_k s_{ik}a_k e^{(i/\hbar)E_k t}\Bigr|^2 \tag{159}$$


which agrees with (158) only when a definite energy state is involved. 
Differentiating (157) relative to t, we obtain 


$$\frac{dx_k}{dt} = \frac{i}{\hbar}E_k a_k e^{(i/\hbar)E_k t} = \frac{i}{\hbar}E_k x_k \tag{160}$$

which can be put in the form of a matrix equation:

$$\frac{dx}{dt} = \frac{i}{\hbar}Hx \tag{161}$$

In the product, $x$ is considered a matrix the first column of which con-
tains the $x_k$ terms, all the other elements being zero. $H$ is the matrix
of the energy, that is, a diagonal matrix with the elements $E_k$. From
the multiplication rule it is readily seen that $Hx$ is a matrix with the
elements $E_kx_k$ in the first column, all the others being zero. Equation
(161) is considered a fundamental law which for quantum mechanics 
means the same as Newton’s second law for classical mechanics. 
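
A small numerical illustration of (157) to (159) may be helpful. The sketch below is not from the original text: it assumes units in which $\hbar = 1$, a two-level system with made-up energies, and a hypothetical row `s_row` of the transformation matrix. The sign of the exponent follows the convention of (157).

```python
import cmath
import math

HBAR = 1.0  # assumption: units with hbar = 1

def evolve(a, energies, t):
    """Components x_k(t) = a_k exp((i/hbar) E_k t), following (157)."""
    return [ak * cmath.exp(1j * ek * t / HBAR) for ak, ek in zip(a, energies)]

def energy_probabilities(x):
    """Probabilities x_k^* x_k of the individual energy values."""
    return [abs(xk) ** 2 for xk in x]

def probability_in_other_system(s_row, x):
    """|sum_k s_ik x_k|^2 for one axis i of another principal system, as in (159)."""
    return abs(sum(s * xk for s, xk in zip(s_row, x))) ** 2

energies = [0.5, 1.5]              # assumed energy values E_0, E_1
a = [2 ** -0.5, 2 ** -0.5]         # amplitudes at t = 0
s_row = [2 ** -0.5, 2 ** -0.5]     # hypothetical row of the matrix ||s_ik||

x_start = evolve(a, energies, 0.0)
x_later = evolve(a, energies, math.pi)

p_energy_start = energy_probabilities(x_start)
p_energy_later = energy_probabilities(x_later)
p_other_start = probability_in_other_system(s_row, x_start)
p_other_later = probability_in_other_system(s_row, x_later)
```

The energy probabilities are the same at both times, whereas the probability referred to the other system changes because the relative phase of the two components has changed.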
When the vector x changes with time, the change, of course, has 
nothing to do with the principal systems belonging to the different 
observables. The change concerns the components of x only, whereas 
the axes of the systems remain fixed in space. In other words, this 
means that the operators Q and P are independent of time. But, as 
only the position of x relative to the coordinate systems is of signifi- 
cance, we can describe the time-dependence of the system just as 
well on the supposition that x remains fixed in space, and the operators 
Q, P, and so on are changed. Then we must proceed in the following 
way. We consider a matrix A belonging to some observable of the 
system and assume S to be the transformation matrix corresponding 
to the transition of the principal energy system K to the system K’ of 
$A$. For the components of $x$ belonging to the $i$th axis of $K'$ we obtain

$$x_i' = \sum_k s_{ik}x_k$$

Therefore

$$\frac{dx_i'}{dt} = \sum_k s_{ik}\frac{dx_k}{dt}$$

In this equation $S$ is considered constant and $x$ is considered as the
variable. If, however, $x$ is considered to be constant, then in order
to obtain the same $dx_i'/dt$ we must substitute such functions of $t$ for
the $s_{ik}$ terms that

$$\sum_k \frac{ds_{ik}}{dt}x_k = \sum_k s_{ik}\frac{dx_k}{dt}$$

or, written in the form of a matrix equation,

$$\frac{dS}{dt}x = S\frac{dx}{dt}$$

When we use (161), this becomes

$$\frac{i}{\hbar}SHx = \frac{dS}{dt}x$$

Therefore

$$\frac{dS}{dt} = \frac{i}{\hbar}SH$$


We evaluate $dA/dt$ from $dS/dt$. To accomplish this, we first transform
$A$ with the help of $S$ to the principal system $K'$ of $A$, obtaining $A' =
SAS^{-1}$, or

$$A'S = SA \tag{162}$$

Because of the significance of $K'$, $A'$ is a diagonal matrix independent of
time, because its elements are the eigenvalues of $A'$ and these are inde-
pendent of time. Upon differentiating (162) relative to time, we
obtain

$$A'\frac{dS}{dt} = \frac{dS}{dt}A + S\frac{dA}{dt}$$

or

$$\frac{i}{\hbar}(A'SH - SHA) = S\frac{dA}{dt}$$

and, because of (162),

$$\frac{dA}{dt} = \frac{i}{\hbar}(AH - HA) \tag{163}$$
Accordingly the time-dependence of a system can be represented not 
only by the dynamical law, (161), but also in such a way that the vector 
x is kept constant while the matrix of any observable is considered a func- 
tion of time satisfying (163). 

From (163) we see again that the systems remain unchanged only 
relative to observables that can be measured simultaneously with the 
energy and are therefore represented by matrices that commute with 
H. Only then does dA/dt = 0, that is, the corresponding principal 
system remains fixed in space so that the probability of any eigenvalue 
does not change with time. 

When we refer all the matrices to the principal energy system K, 
which system is considered fixed in space, H consists only of its diagonal 
elements $E_k$ and therefore the decomposition of (163) furnishes the
relations

$$\frac{da_{ik}}{dt} = \frac{i}{\hbar}a_{ik}(E_k - E_i) \qquad a_{ik} = a_{ik}^0 e^{(i/\hbar)(E_k - E_i)t} \tag{164}$$
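
That the two descriptions are equivalent can be verified on a small example. The sketch below is an illustration only and not part of the original text; it assumes $\hbar = 1$, a two-level system, and an arbitrarily chosen Hermitean matrix `A0`. It compares the expectation value obtained with a moving vector and a fixed matrix against the one obtained with a fixed vector and a matrix moving according to (164).

```python
import cmath

HBAR = 1.0  # assumption: units with hbar = 1

def expectation(x, a):
    """Hermitean form sum_ik x_i^* a_ik x_k."""
    n = len(x)
    return sum(x[i].conjugate() * a[i][k] * x[k]
               for i in range(n) for k in range(n))

def moving_vector(x0, energies, t):
    """x_k(t) = x_k(0) exp((i/hbar) E_k t); the matrices stay fixed."""
    return [xk * cmath.exp(1j * ek * t / HBAR) for xk, ek in zip(x0, energies)]

def moving_matrix(a0, energies, t):
    """a_ik(t) = a_ik(0) exp((i/hbar)(E_k - E_i) t), equation (164); x stays fixed."""
    n = len(a0)
    return [[a0[i][k] * cmath.exp(1j * (energies[k] - energies[i]) * t / HBAR)
             for k in range(n)] for i in range(n)]

energies = [0.5, 1.5]                 # assumed energy values
x0 = [0.6, 0.8j]                      # normalized state vector at t = 0
A0 = [[1, 2 - 1j], [2 + 1j, -1]]      # some Hermitean matrix (assumption)
t = 0.7

value_moving_vector = expectation(moving_vector(x0, energies, t), A0)
value_moving_matrix = expectation(x0, moving_matrix(A0, energies, t))
value_initial = expectation(x0, A0)
```

Both procedures give the same real number at time t, and this number differs from the value at t = 0 because `A0` does not commute with the energy matrix.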


It is remarkable that equations (163) for $A = Q$ and $A = P$ can be
given a form in which they correspond exactly to the canonical equa-
tions of classical mechanics:

$$\frac{dq}{dt} = \frac{\partial H}{\partial p} \qquad \frac{dp}{dt} = -\frac{\partial H}{\partial q}$$

To verify this we make use of the theorem that, if $F(QP)$ is an arbitrary
polynomial of the canonical matrices $Q$ and $P$,

$$\frac{\hbar}{i}\frac{\partial F}{\partial P} = QF - FQ \tag{165}$$

The equation certainly holds for $F = P$, for it then becomes

$$\frac{\hbar}{i} = QP - PQ$$

which is the commutation relation that is assumed to be satisfied by
$Q$ and $P$. For the same reason it holds for $F = Q$. If, however, equa-
tion (165) holds for two special functions $F_1$ and $F_2$, it must hold for
the sum $F_1 + F_2$, and the product $F_1F_2$, for we have

$$\frac{\hbar}{i}\frac{\partial (F_1F_2)}{\partial P} = \frac{\hbar}{i}\Bigl(\frac{\partial F_1}{\partial P}F_2 + F_1\frac{\partial F_2}{\partial P}\Bigr) = QF_1F_2 - F_1QF_2 + F_1QF_2 - F_1F_2Q = QF_1F_2 - F_1F_2Q$$

Accordingly (165) may be applied to any polynomial formed from $Q$
and $P$ and to any sum of such polynomials. Hence the theorem stated
is true. Similarly it can be shown that, for any $F(QP)$,

$$\frac{\hbar}{i}\frac{\partial F}{\partial Q} = FP - PF$$

Thus equations (163) for $A = Q$ and $A = P$ may be written in the form

$$\frac{dQ}{dt} = \frac{i}{\hbar}(QH - HQ) = \frac{\partial H}{\partial P} \qquad \frac{dP}{dt} = \frac{i}{\hbar}(PH - HP) = -\frac{\partial H}{\partial Q} \tag{166}$$


In these equations for the matrices $Q$, $\partial H/\partial P$, and so on, the Hermitean
form $x^\dagger Qx$, $x^\dagger(\partial H/\partial P)x$, and so on, representing the expectation
values of $Q$, $\partial H/\partial P$, $\ldots$, may be substituted and $x$ must be con-
sidered constant. Thus we return to the relations mentioned at the
end of Section 15. The internal connection between quantum mechan- 
ics and classical theory is especially clear here and can be characterized 
by this statement: The laws of classical mechanics also hold in quantum
mechanics provided that they are considered relations between statistical
mean values.

34. Systems with Many Degrees of Freedom. Up to this point, 
for the sake of simplicity, we have considered systems with only one 
degree of freedom. If there are n degrees of freedom, where n > 1, 
as when a particle can have motion in all directions, we must correlate 
a matrix $Q_k$ to each of the $n$ coordinates $q_k$ by which the spatial con-
figuration of the system is described. A canonically conjugate
momentum $P_k$ must be coordinated to every $Q_k$, any pair $Q_kP_k$ satisfy-
ing the commutation relation

$$Q_kP_k - P_kQ_k = \frac{\hbar}{i} \tag{167}$$

whereas, if $i \ne k$,

$$Q_iP_k - P_kQ_i = 0 \tag{168}$$

because two entities that can be characterized by different indices
can be observed simultaneously and thus the corresponding matrices
can be transformed to the same system $K$ of principal axes. In
addition there are the relations

$$Q_iQ_k - Q_kQ_i = 0 \qquad P_iP_k - P_kP_i = 0$$


From these we observe that all the principal systems belonging to
$Q_1, Q_2, \ldots$ form a system $K$ of the following kind. With any axis
of $K$ we can correlate an eigenvalue $\alpha_m^{(1)}$ of $Q_1$, an eigenvalue $\alpha_n^{(2)}$ of
$Q_2$, and so on, so that we can speak of the axis $\alpha_m^{(1)}, \alpha_n^{(2)}, \ldots$.
When the vector $x$ which describes the state has the direction of this
axis, a simultaneous measurement of the coordinates $q_1, q_2, \ldots$
will, with certainty, furnish the results $\alpha_m^{(1)}, \alpha_n^{(2)}, \ldots$. Evidently
the eigenvalues $\alpha_m^{(1)}, \alpha_n^{(2)}, \ldots$ of $Q_1, Q_2, \ldots$ may be associated in
any combination so that there is an axis of $K$ for any set of these.
Thus the principal system of any $Q_i$ is degenerated, that is, there exists
for any eigenvalue $\alpha_m^{(i)}$ of $Q_i$ an $(n - 1)$-times infinite manifold of
different axes every one of which is correlated to another combination
of $\alpha_m^{(i)}$ with the eigenvalues of the other $Q$. The same holds for the
$P_k$ operators. They too are degenerated, and we can correlate an axis
to any combination of eigenvalues $\beta_m^{(1)}, \beta_n^{(2)}, \ldots$.

It is easy to answer the question of the manner in which the state
of a system, subjected to given physical conditions, is to be investigated
and the question of requirements the state has to satisfy in order to
be "well-defined" in the sense of Section 27. We consider a great
number of systems and form a statistical ensemble for the group by
measuring the values of the coordinates $q_1, q_2, \ldots$ for any one of
them. In this way we find a certain probability $|x_{\alpha_m^{(1)}\alpha_n^{(2)}\cdots}|^2$ for
any set of $\alpha_m^{(1)}, \alpha_n^{(2)}, \ldots$, and similarly the probabilities
$|x'_{\beta_m^{(1)}\beta_n^{(2)}\cdots}|^2$ of any set $\beta_m^{(1)}, \beta_n^{(2)}, \ldots$ for the momenta can be
ascertained. If the probabilities for any large assemblage are identical,
we call the state "well-defined" and represent it by a vector the
squares of magnitudes of the components of which, relative to the
principal systems of coordinates and momenta, give the observed
probabilities.

Of course, such a representation is possible only after the principal 
systems of Q; and P; have been ascertained. For this purpose we 
adopt the same method as for a system with one degree of freedom. 
We first consider an arbitrarily chosen normal coordinate system $K_0$
and determine a pair of matrices $Q_k^0$, $P_k^0$ that satisfy (167) and (168).
With the determination of these pairs the problem is solved, since
then we know the systems $K'$ and $K''$ which are formed by the principal
axes of $Q_k^0$ and $P_k^0$ respectively. In order to find the principal energy
system $K$, a unitary transition $K_0 \to K$ has to be performed by means
of which the given Hamiltonian $H(Q_1, Q_2, \ldots, P_1, P_2, \ldots)$ of
the system is transformed into a diagonal matrix. In the case of a
non-degenerated $H$, that is, if there is only one state $x$ of a given energy,
the principal system $K$ is uniquely determined by $H$. Then there
exists only one unitary matrix $S$ which effects the transition $K_0 \to K$
and transforms the matrices $Q_k^0$ and $P_k^0$ into $Q_k = SQ_k^0S^{-1}$ and
$P_k = SP_k^0S^{-1}$.

In general, however, $H$ is degenerated. Then there exist $r_k$ differ-
ent axes of $K$, all belonging to the same eigenvalue $E_k$, or, in other
words, there are $r_k$ different well-defined states of the same energy
$E_k$. For example, let us assume that the eigenvalue $E_1$ is twice
degenerated, so that the first three axes of $K$ (which may be designated
by 1, 2, 3) correspond to the same energy $E_1$. These three axes then
will scaffold a three-dimensional space containing all the vectors $x$
for which only the components $x_1$, $x_2$, and $x_3$ differ from zero. Any of
these vectors is transformed by $H$ into a multiple $E_1$ of itself, for since
$H$ is of the form

$$H = \begin{pmatrix} E_1 & 0 & 0 & \cdots \\ 0 & E_1 & 0 & \cdots \\ 0 & 0 & E_1 & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}$$

the equation $y = Hx$ gives

$$y_1 = E_1x_1 \qquad y_2 = E_1x_2 \qquad y_3 = E_1x_3 \qquad y_4 = y_5 = \cdots = 0$$

Thus any vector $x$ belonging to the space scaffold of the axes 1, 2, 3
represents an eigenvector of $H$, the corresponding eigenvalue being $E_1$,
so that $x$ indicates a state of sharply measured $E_1$. This means, how-
ever, that, instead of the 1, 2, 3 directions, we may choose as well three other
normal directions of the space $E_1$ for the first three axes of $K$. Because of
this arbitrariness, the transformation to principal axes in the case
of a degenerated $H$ is not unique but permits different solutions.
Together with $K$, $K_1$ is also a solution which can be obtained from $K$
with the aid of a unitary matrix of the kind

$$S = \begin{pmatrix} S^{(1)} & 0 & 0 & \cdots \\ 0 & S^{(2)} & 0 & \cdots \\ 0 & 0 & S^{(3)} & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix} \tag{169}$$

where $S^{(k)}$ is a unitary matrix of the order $r_k$.

In the case of a degenerated H, the following point should be brought 
out: Let us assume that a given state of a system is represented by x, 
its components relative to $K$ being $x_k$. How are we to determine the
probability that an energy measurement will give the value $E_k$ if
this value occurs $r_k$ times? If $H$ were non-degenerated, this proba-
bility would be given by the square of the component of $x$ in that
direction. If there is more than one $E_k$ direction, this rule is mean-
ingless and we must modify it as follows: The probability in question
equals the sum $\sum x_i^*x_i$ extended over all the $x_i$ components that belong
to the axes of the same $E_k$. This postulate is consistent with the whole
plan, since the assumed probability is indifferent to a possible change
of the principal system $K$, for, if $K$ is transformed by the matrix (169)
into $K_1$, the components $x_i$ of the space $E_k$ are altered in such a way
that

$$\sum x_i^*x_i = \sum x_i'^*x_i'$$

We again obtain for the expectation value $\bar{E}$ the expression $\sum_k E_kx_k^*x_k$,
in which all the components of $x$ are to be considered so that every $E_k$
contributes as many terms as the postulate about the probability of
$E_k$ demands.
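
The consistency of this postulate can be illustrated numerically. The following sketch is not from the original text; the state vector, the assignment of the first three axes to one energy value, and the rotation parameters are assumptions chosen for illustration. It applies unitary changes of axes inside the degenerate space and compares the summed probability before and after.

```python
import cmath
import math

def rotate_pair(x, i, j, theta, phi):
    """Apply the 2x2 unitary [[c, s e^{i phi}], [-s e^{-i phi}, c]] to components i, j."""
    c, s = math.cos(theta), math.sin(theta)
    y = list(x)
    y[i] = c * x[i] + s * cmath.exp(1j * phi) * x[j]
    y[j] = -s * cmath.exp(-1j * phi) * x[i] + c * x[j]
    return y

def level_probability(x, axes):
    """Sum of x_i^* x_i over all axes belonging to the same energy value."""
    return sum(abs(x[i]) ** 2 for i in axes)

x = [0.5, 0.5j, 0.5, -0.5]   # assumed normalized state vector
axes_E1 = (0, 1, 2)          # assumption: the first three axes share one energy

# Two successive changes of axes inside the degenerate space.
x_new = rotate_pair(x, 0, 1, 0.4, 0.9)
x_new = rotate_pair(x_new, 1, 2, 1.1, -0.3)

p_before = level_probability(x, axes_E1)
p_after = level_probability(x_new, axes_E1)
```

The individual components change, but the summed probability for the degenerate energy value is the same in both systems of axes.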




PROBLEMS 


1. Prove that the commutability of two matrices is not only a necessary but 
also a sufficient condition that the matrices can be diagonalized. 
2. Given three matrices which satisfy the relations

$$AB - BA = 2iC \qquad BC - CB = 2iA \qquad CA - AC = 2iB \qquad \text{and} \qquad A^2 = B^2 = C^2 = 1$$

show that the following equations must hold:

$$AB = -BA \qquad BC = -CB \qquad CA = -AC \qquad \text{and} \qquad ABC = i$$


3. Given the same matrices as in Problem 2. If $A$, $B$, and $C$ are considered the
components of a vector $S$, the components of $S$ referred to a new set of axes are
given by

$$A' = \alpha_1A + \beta_1B + \gamma_1C, \text{ etc.}$$

Show that $A'$, $B'$, and $C'$ satisfy the same equations as $A$, $B$, $C$.

4. From $C^2 = 1$ in Problem 2, it can be inferred that the eigenvalues of $C$ must
be $+1$ and $-1$. Thus, for example, $C$ in its diagonal form will be given by

$$C = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$$


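Show that $A$ and $B$ are then given by

$$A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \qquad B = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}$$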


5. Find the eigenvalues of the matrix

$$\begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & -1 & 0 \\ 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 \end{pmatrix}$$

6. Show that for Dirac's function $\delta(x)$ the following equations hold:

$$x\,\delta(x) = 0 \qquad \delta(ax) = a^{-1}\,\delta(x)$$

$$\delta(x) = \delta(-x) \qquad f(x)\,\delta(x - a) = f(a)\,\delta(x - a)$$


7. The components of angular momentum are $m_x = yp_z - zp_y$, etc., with
$p_x = (\hbar/i)(\partial/\partial x)$. Prove that the following relations hold. (The notation
$[ab]$ is an abbreviation for $ab - ba$.)

$$[m_zx] = y \qquad [m_zy] = -x \qquad [m_zz] = 0$$
$$[m_zp_x] = p_y \qquad [m_zp_y] = -p_x \qquad [m_zp_z] = 0$$
$$[m_xm_y] = m_z \qquad [m_ym_z] = m_x \qquad [m_zm_x] = m_y$$

8. Prove that, if $\theta = m_x^2 + m_y^2 + m_z^2$, $[m_x\theta] = [m_y\theta] = [m_z\theta] = 0$.
9. Discuss the transformation $A' = SAS^{-1}$ if the unitary matrix $S$ differs from
unity by a small matrix $C$ the square of which may be neglected.
10. Describe the connection between the commutation relation and the uncer- 
tainty relation. 


4 
PERTURBATION THEORY 


35. Perturbation of Non-degenerated Systems. Thus far we 
have assumed always that the systems were closed, and on this assump- 
tion was developed a theory which cannot be applied to many impor- 
tant problems. The theory fails when some external force, for exam- 
ple an electric or magnetic field or the field of a light wave, acts on the 
system. We know from experience that certain characteristic effects 
are observed in this case, effects that can be explained by quantum 
mechanics only by using a theory that applies to open systems as 
well. It is quite true that we could subscribe to the view that there 
is no need for an elaborate perturbation theory because a perturbed
system can always be formed into a closed one by uniting it with the 
perturbing system. In most cases, however, this is not practical 
because it leads to equations that cannot be solved with the ordinary 
means of mathematical analysis; therefore we must resort to other 
methods. In what follows we present a method which is useful pro- 
vided that the perturbation is sufficiently weak. 

Let us assume that the energy of the unperturbed system is non-
degenerated. When there is no perturbation, the solution of the
problem can be known and is given by the canonical matrices $Q_0$ and
$P_0$ with which the function $H_0(QP)$ becomes diagonal.

$$H_0 = H_0(Q_0P_0) = \begin{pmatrix} E_0 & 0 & 0 & \cdots \\ 0 & E_1 & 0 & \cdots \\ 0 & 0 & E_2 & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}$$

Let the coordinate system associated with $Q_0P_0$ and $H_0$ be denoted
by $K_0$. Now assume that the perturbation is taking place. The rela-
tion between energy, coordinates, and momenta changes from $H_0(QP)$
to $H(QP)$. If the perturbation is sufficiently weak, $H_0$ and $H$ differ
but slightly and we may write

$$H(QP) = H_0(QP) + H'(QP) \tag{170}$$


where $H'(QP)$ is a small matrix function of $(QP)$ which depends on the
nature of the perturbation and is assumed to be self-adjoint so that $H$
as well as $H_0$ is Hermitean. We find an approximate solution for the
perturbed system in the following way. As the unperturbed system is
supposed to be non-degenerated, all its eigenvalues $E_k^0$ will differ
from each other. If we denote the corresponding eigenvectors by
$x_k^0$, then $H_0x_k^0 = E_k^0x_k^0$. The perturbation will change the eigen-
values and eigenvectors to $E_k$ and $x_k$ respectively. Because of the
smallness of the perturbation, the differences $E_k - E_k^0$ and $x_k - x_k^0$
will be small and we may write, for $E_k$ and $x_k$,

$$E_k = E_k^0 + E_k^1 + E_k^2 + \cdots \qquad x_k = x_k^0 + x_k^1 + x_k^2 + \cdots \tag{171}$$

in which $x_k^1$ and $E_k^1$ are terms of the same degree of smallness as the
perturbation itself, $x_k^2$ and $E_k^2$ are small to the second degree, and so
on. Then $Hx_k = E_kx_k$ becomes

$$(H_0 + H')(x_k^0 + x_k^1 + x_k^2 + \cdots) = (E_k^0 + E_k^1 + E_k^2 + \cdots)(x_k^0 + x_k^1 + x_k^2 + \cdots)$$

By separating terms of different orders of magnitudes, we obtain

$$H_0x_k^0 = E_k^0x_k^0 \tag{172}$$
$$H_0x_k^1 + H'x_k^0 = E_k^0x_k^1 + E_k^1x_k^0 \tag{173}$$
$$H_0x_k^2 + H'x_k^1 = E_k^0x_k^2 + E_k^1x_k^1 + E_k^2x_k^0 \tag{174}$$


Equation (172) shows again that $E_k^0$ and $x_k^0$ represent a state of the
unperturbed system. From (173) we can evaluate the first-order
corrections, $x_k^1$ and $E_k^1$. To do this we take from all the vectors
of (173) the components referring to the $n$th axis of the principal
system $K_0$, which belongs to the energy of the unperturbed system.
If we denote the components of $x_k^1$ in $K_0$ by $x_n'$, then $H_0x_k^1$ is a vector
with the components $E_n^0x_n'$, because $H_0$ is diagonal in $K_0$ and thus
transforms a vector $x_0', x_1', \ldots$ into $E_0^0x_0', E_1^0x_1', \ldots$. Fur-
thermore $H'x_k^0$ is a vector with components $\sum_l H_{nl}'x_l = H_{nk}'$, since of
the components $x_l$ of $x_k^0$ only $x_k$ is unity, all the rest being zero. Finally,
the components of the vectors on the right-hand side are $E_k^0x_n'$ and
$E_k^1\delta_{nk}$, wherein, for $n = k$, $\delta_{nk} = 1$, and, for $n \ne k$, $\delta_{nk} = 0$. Thus
(173) gives

$$E_n^0x_n' + H_{nk}' = E_k^0x_n' + E_k^1\delta_{nk}$$

and, when $n = k$, we obtain

$$E_k^1 = H_{kk}' \tag{175}$$

Therefore the correction term $E_k^1$ of the first order for the $k$th eigenstate is
given by the diagonal element $H_{kk}'$ of the perturbation matrix. On the
other hand, when $n \ne k$, we have

$$x_n' = \frac{H_{nk}'}{E_k^0 - E_n^0} \tag{176}$$

The $x_n'$ terms are the components of the correction vector $x_k^1$. Thus,
by (176), we can evaluate the $x_k^1$ vectors with the exception of the $k$th
component, which remains arbitrary.

In a corresponding way the correction terms of the second order
can be found from (174). Again we take from all the vectors the
components in the direction of the $n$th axis of $K_0$, which we shall
designate by $x_n''$. We obtain

$$E_n^0x_n'' + \sum_l H_{nl}'x_l' = E_k^0x_n'' + E_k^1x_n' + E_k^2\delta_{nk}$$

which, for $n = k$, gives

$$E_k^2 = \sum_l H_{kl}'x_l' - E_k^1x_k' = \sum_l H_{kl}'x_l' - H_{kk}'x_k'$$

and, because of (176),

$$E_k^2 = \sum_{l \ne k} \frac{H_{kl}'H_{lk}'}{E_k^0 - E_l^0} \tag{177}$$

For the case $n \ne k$ we obtain an equation from which the $x_n''$ com-
ponents of the vector $x_k^2$ can be evaluated. It is not necessary to
explain how this process is continued in order to get the correction 
terms of higher order. 
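
As a check on (175) and (177), the corrections can be compared with an exactly soluble case. The sketch below is an illustration and not part of the original text; the two unperturbed energies and the small real symmetric perturbation matrix are made-up numbers, and the exact eigenvalues are taken from the elementary formula for a real symmetric two-rowed matrix.

```python
import math

def energy_corrections(e0, hp, k):
    """First- and second-order corrections (175) and (177) for level k."""
    first = hp[k][k]
    second = sum(hp[k][l] * hp[l][k] / (e0[k] - e0[l])
                 for l in range(len(e0)) if l != k)
    return first, second

def exact_two_level(e0, hp):
    """Exact eigenvalues (ascending) of the real symmetric 2x2 matrix H0 + H'."""
    h11, h22, h12 = e0[0] + hp[0][0], e0[1] + hp[1][1], hp[0][1]
    mean, half = (h11 + h22) / 2, (h11 - h22) / 2
    root = math.hypot(half, h12)
    return mean - root, mean + root

e0 = [1.0, 2.0]                        # assumed unperturbed energies
hp = [[0.02, 0.05], [0.05, -0.01]]     # assumed small perturbation matrix

first, second = energy_corrections(e0, hp, 0)
approx_first_order = e0[0] + first
approx_second_order = e0[0] + first + second
exact_lower, exact_upper = exact_two_level(e0, hp)
```

For these numbers the second-order value lies closer to the exact lower eigenvalue than the first-order value does, and the remaining error is of the third order in the perturbation.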

The formulas derived can be translated without difficulty into the 
notation of wave mechanics. Instead of $H_0$ we then have an operator
$L$ which acts on a function $\psi(xyz)$, determining the eigenfunctions
$\psi_k^0$ and the eigenvalues $E_k^0$ of the unperturbed system by the equation
$L\psi = E\psi$. The perturbation changes $L$ into $L + L'$ with the eigenquanti-
ties $\psi_k$ and $E_k$. The correction terms $E_k'$ and $\psi_k'$ can then be evaluated
from (175) and (176). Making use of the rules of Section 30, we obtain

$$E_k' = \int \psi_k^{0*}L'\psi_k^0\,dv$$

$$x_n' = \frac{\int \psi_n^{0*}L'\psi_k^0\,dv}{E_k^0 - E_n^0} \qquad (\text{for } n \ne k) \tag{178}$$

and, from $x_n'$, we find

$$\psi_k' = \sum_n x_n'\psi_n^0$$


36. Perturbation of Degenerated Systems. Now we consider 
the case of a system, such as the H atom, which is degenerated if no 
perturbation acts on it, so that there are different states having the 
same energy $E_k$. It may turn out that the degeneracy is cancelled
by the perturbation, with the result that an originally single spectral
line is resolved into a group of lines. If in the unperturbed atom the
energy $E_i$ is degenerated $r_i - 1$ times and $E_k$ is degenerated $r_k - 1$
times, the line $\nu_{ik} = (E_i - E_k)/h$ represents a complexity of $r_ir_k$ lines
of different origins which coincide in the line $\nu_{ik}$ and decompose if the
respective $E_i$ and $E_k$ values are influenced in a different way by the
perturbation. 

Thus we assume now an energy matrix $H_0$ of the unperturbed system
which, when referred to the corresponding principal system $K_0$, is
diagonal and contains the $E_0$ term $r_0$ times, the $E_1$ term $r_1$ times, and
so on. Accordingly there exist for any of the values $E_k$, $r_k$ different
eigenvectors which can be designated by $x_{k1}^0, x_{k2}^0, \ldots$ and which
scaffold an orthogonal coordinate system in the $E_k$ space. The $x_{ki}^0$
vectors are not uniquely determined, since we can exchange the coor-
dinate system for any other orthogonal one without changing the
matrix $H_0$. Then we obtain other eigenvectors which are also trans-
formed by $H_0$ into a multiple $E_k$ of themselves. This implies a certain
complication in the theory. If we write the eigenvectors $x_{k1}, x_{k2}, \ldots$
of the perturbed system in the form

$$x_{ki} = x_{ki}^0 + x_{ki}^1 + x_{ki}^2 + \cdots \tag{179}$$

the terms after the first on the right are small only if $x_{ki}^0$ is as near as
possible to $x_{ki}$. It is assumed that we know the $x_{ki}^0$ vectors which
satisfy this condition although, a priori, we cannot say how these are
to be found. However, this will be learned from the equations to
which the assumption leads us. As in Section 35, these equations are

$$H_0x_{ki}^0 = E_k^0x_{ki}^0$$
$$H_0x_{ki}^1 + H'x_{ki}^0 = E_k^0x_{ki}^1 + E_{ki}^1x_{ki}^0 \tag{180}$$


Now we choose one of the principal axes $n$ of $H_0$. This need not belong
to the $E_k$ space. If, as we did in Section 35, we take all the components
of the vectors in (180) which have the direction of $n$, we obtain

$$\sum_l H_{nl}^0x_l' + \sum_l H_{nl}'x_l = E_k^0x_n' + E_{ki}^1x_n$$

the components of $x_{ki}^0$ and $x_{ki}^1$ being $x_l$ and $x_l'$ respectively.

We apply this equation first for the case in which the direction of $n$
lies within the $E_k$ space. Then, since $H_0$ is diagonal, $H_{nl}^0$ differs from
zero only for $l = n$ and equals $E_k^0$. The equation reduces to

$$\sum_l H_{nl}'x_l = E_{ki}^1x_n \tag{181}$$

As the $x_l$ terms in (181) signify the components of the vector $x_{ki}^0$,
they differ from zero only within the $E_k$ space, and hence the summa-
tion above is to be extended over that space only, that is, the sum in
(181) applies only to that part of the matrix $H'$ which corresponds to
the $E_k$ space. Therefore, in this case, (181) becomes an eigenvector
problem, for we have to find those vectors $x_{ki}^0$ the components $x_l$ of
which are transformed by $H'$ into multiples of themselves. To solve
this problem that part of $H'$ which refers to the $E_k$ space must be made
diagonal. The coordinate system in which $H'$ is diagonal fixes the
orthogonal directions of $x_{k1}^0, x_{k2}^0, \ldots$ which were used in (179) as
satisfying the condition that $x_{ki}^1, x_{ki}^2, \ldots$ become small in this case.
From (181) we see that the correction terms $E_{k1}^1, E_{k2}^1, \ldots$ of the
energy $E_k$ are given by the elements of that part of $H'$ which corres-
ponds to the $E_k$ space and is diagonal.

Unless this part of H’ is degenerated again, the degeneracy is can- 
celled and the consequent procedure is the same as in the preceding 
section. 
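
The eigenvector problem (181) is easily carried through for a twofold level. The sketch below is an illustration under stated assumptions and not part of the original text: the part of $H'$ belonging to the degenerate space is taken to be a real symmetric two-rowed matrix with made-up elements, and the function name `diagonalize_block` is hypothetical.

```python
import math

def diagonalize_block(a, b, d):
    """Eigenvalues and orthonormal eigenvectors of the real symmetric block [[a, b], [b, d]].

    The eigenvalues are the first-order energy corrections; the eigenvectors
    are the zeroth-order vectors fixed by the perturbation.
    """
    theta = 0.5 * math.atan2(2 * b, a - d)
    c, s = math.cos(theta), math.sin(theta)
    mean, root = (a + d) / 2, math.hypot((a - d) / 2, b)
    return [(mean + root, (c, s)), (mean - root, (-s, c))]

# Assumed part of H' that belongs to a twofold degenerate energy space.
a, b, d = 0.05, 0.04, -0.01
solutions = diagonalize_block(a, b, d)

def residual(value, vector):
    """Largest component of (block * vector - value * vector)."""
    v0, v1 = vector
    return max(abs(a * v0 + b * v1 - value * v0),
               abs(b * v0 + d * v1 - value * v1))
```

With these numbers the level splits into two, shifted by 0.07 and by -0.03, so that the degeneracy is cancelled in the first order.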

37. Perturbation as Causing Transitions. The purpose of the 
method described in the two preceding sections was to evaluate the 
terms with which we must correct the energy levels and the eigenfunc- 
tions of the unperturbed system in order to obtain the characteristic 
quantities of the perturbed system. Very frequently, however, we are 
interested in another problem: Let us assume that the perturbation 
has been effective for some time. Before action starts, the system may 
have been in one of the eigenstates of the matrix $H_0$. Now let $H'$
begin to operate, and then after a time $t$ let its action cease. The
system, again being unperturbed, is now investigated and is found to
be in some other eigenstate of $H_0$ rather than in the original one. That
which we wish to determine is the probability of finding the system at
time $t$ in an arbitrarily given final state.




For the solution of this problem we assume that the system is in the
state $x_k$, satisfying the equation $H_0x_k = E_kx_k$, once the perturbation has
started. Because of the perturbation, the state changes according to
the equation

$$\frac{\hbar}{i}\frac{dx}{dt} = (H_0 + H')x$$

Resolving the vectors into components relative to the principal system
$K_0$ of $H_0$, we obtain

$$\frac{\hbar}{i}\frac{dx_n}{dt} = E_nx_n + \sum_m H_{nm}'x_m \tag{182}$$

For $H' = 0$, the solution of this equation would be

$$x_n = x_n^0e^{(i/\hbar)E_nt}$$

where $x_n^0 = \delta_{nk}$. In other words, the system would remain in the
state $E_k$. The perturbation makes this impossible, thus causing the
state to change with time. To describe this process it is convenient to
give the solution of (182) in the form

$$x_n = \xi_n(t)e^{(i/\hbar)E_nt}$$

where $\xi_n(t)$, which replaces $x_n^0$, is a function of $t$ which is to be deter-
mined. Then, for (182), we obtain

$$\frac{\hbar}{i}\frac{d\xi_n}{dt} = \sum_m H_{nm}'\xi_me^{(i/\hbar)(E_m - E_n)t} \tag{183}$$
If we assume a small $H'$, this equation, which is still exact, can be
solved to a first approximation if we substitute for the $\xi_m(t)$ terms their
values at $t = 0$; and thus $\xi_m = \delta_{mk}$. Then we obtain

$$\frac{\hbar}{i}\frac{d\xi_n}{dt} = H_{nk}'e^{(i/\hbar)(E_k - E_n)t}$$

Therefore

$$\xi_n(t) = H_{nk}'\frac{e^{(i/\hbar)(E_k - E_n)t} - 1}{E_k - E_n} \tag{184}$$

and

$$|\xi_n(t)|^2 = |H_{nk}'|^2\frac{4}{(E_k - E_n)^2}\sin^2\frac{(E_k - E_n)t}{2\hbar} \tag{184'}$$

$|\xi_n|^2$ is the probability with which a measurement of the energy,
carried out on the system at time $t$, at which time the perturbation has
ceased, gives the value $E_n$. Thus the effect of the perturbation is that
the system does not maintain its original state but in time $t$ changes
into another state, the probability of the transition being given by
(184'), which, we note, is proportional to $|H_{nk}'|^2$.
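
The quality of the first approximation can be judged by integrating the exact equation (183) numerically. The sketch below is not part of the original text; it assumes $\hbar = 1$, a two-level system with made-up energies and a weak, constant perturbation, and it uses the classical fourth-order Runge-Kutta method (our choice, not the author's) for the integration.

```python
import cmath
import math

HBAR = 1.0  # assumption: units with hbar = 1

def derivative(t, xi, energies, hp):
    """Right-hand side of (183) solved for d(xi_n)/dt."""
    n = len(xi)
    return [1j / HBAR * sum(hp[r][m] * xi[m]
                            * cmath.exp(1j * (energies[m] - energies[r]) * t / HBAR)
                            for m in range(n))
            for r in range(n)]

def integrate(xi0, energies, hp, t_end, steps=2000):
    """Fourth-order Runge-Kutta integration of (183) from 0 to t_end."""
    xi, t, h = list(xi0), 0.0, t_end / steps
    for _ in range(steps):
        k1 = derivative(t, xi, energies, hp)
        k2 = derivative(t + h / 2, [x + h / 2 * k for x, k in zip(xi, k1)], energies, hp)
        k3 = derivative(t + h / 2, [x + h / 2 * k for x, k in zip(xi, k2)], energies, hp)
        k4 = derivative(t + h, [x + h * k for x, k in zip(xi, k3)], energies, hp)
        xi = [x + h / 6 * (a + 2 * b + 2 * c + d)
              for x, a, b, c, d in zip(xi, k1, k2, k3, k4)]
        t += h
    return xi

def first_order_probability(h_nk, e_k, e_n, t):
    """Transition probability according to (184')."""
    d = e_k - e_n
    return abs(h_nk) ** 2 * 4 / d ** 2 * math.sin(d * t / (2 * HBAR)) ** 2

energies = [0.0, 1.0]                 # assumed E_k (initial) and E_n (final)
hp = [[0.0, 0.01], [0.01, 0.0]]       # assumed weak perturbation matrix
t_end = 3.0

xi = integrate([1.0 + 0j, 0j], energies, hp, t_end)
p_numerical = abs(xi[1]) ** 2
p_first_order = first_order_probability(hp[1][0], energies[0], energies[1], t_end)
```

For a perturbation this weak the two probabilities agree to within a fraction of one per cent; the agreement deteriorates as the elements of `hp` become comparable with the energy difference.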

It may happen that H,,’ vanishes for certain values of n and k, 
Then, to a first approximation, the probability of a transition k — n 
is zero. However, the transition can be effected through the inter- 
vention of one or more intermediate states, the system first going from 
k to mand then from mton. The necessary condition for this process 
is that both Him’ and Hn’ be not zero. On proceeding from the first 
to the second approximation, we must carry out the calculation as 
follows: According to (184) we have 


ei) (By—B yt _ 
E; 4 En 
Introducing this into (183), we obtain 


Bde Shop! ogy aloes ae 
ae nm 41mk 
m 


Em(t) =H mk 


i dt Ei, — Em 
Hence 
Ham Hint’ (—— 9 | ev» (By—2,)t — ) 
i De ee ee eS (188 
a(t) H-fak Boe. Fr. —£, 30) 


As we shall see, the second term of this expression is generally negligi- 
ble; therefore we can write 


4 c (Ex an E,)t 
a= 2 fe a ne Seat 
|én(d)|? = |H’| i, — Be)? sin = (186) 
where 
Hoth! 
= —__ ene T= MN 
H ay 


which now replaces $H_{nk}'$. As the product $H_{nm}'H_{mk}'$ now occurs in $\xi_n$,
the probability of finding the system in the state $n$ is now small in the
fourth order rather than in the second order. Where there are no
intermediate states for which $H_{nm}'$ and $H_{mk}'$ are not zero, we can, of
course, continue the procedure by considering transitions that are
effected by the intervention of two or more intermediate states.
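
As an illustration of how the intermediate states enter, the effective matrix element $H' = \sum_m H_{nm}'H_{mk}'/(E_k - E_m)$ can be evaluated for a small model. This is a sketch under stated assumptions: the three-level matrix, the energies, and the rule of skipping terms with a vanishing denominator are hypothetical choices made only for the example.

```python
def effective_coupling(h, energies, n, k):
    """Second-order effective element H' = sum_m H'_nm H'_mk / (E_k - E_m).

    Terms with E_m == E_k (including m == k) are skipped; this is an
    assumption of the sketch, since the sum is not defined for them.
    """
    total = 0j
    for m in range(len(energies)):
        if m == k or energies[m] == energies[k]:
            continue
        total += h[n][m] * h[m][k] / (energies[k] - energies[m])
    return total


if __name__ == "__main__":
    # Hypothetical three-level model: the direct element H'_20 vanishes,
    # but state 1 couples to both 0 and 2.
    energies = [0.0, 1.0, 0.02]
    h = [
        [0.0, 0.1, 0.0],
        [0.1, 0.0, 0.2],
        [0.0, 0.2, 0.0],
    ]
    print(effective_coupling(h, energies, n=2, k=0))
```

In this model the only contribution comes from $m = 1$, so that $H' = H_{21}'H_{10}'/(E_0 - E_1)$.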

In the application of perturbation theory we are frequently con- 
fronted with the case wherein there exist a great number of states the 
energies of which nearly agree with that of the considered final state. 
For example, let us consider the case of an atom interacting with the 
surrounding radiation. The emission of light is caused by a transition
of the system (which consists of atom and radiation) by which the 
state of the atom is changed from a into b, and simultaneously the 
quantum number of a radiation oscillator is increased by 1. (The 
significance of a radiation oscillator will be explained in Chapter 7.) 
Now the frequencies of the oscillators of a cavity radiation form a very 
dense line spectrum, and it is therefore of no importance to know the 
probability that a certain oscillator will be changed by the process. 
It is sufficient to ascertain the probability that, by the transition 
$a \to b$ of the atom, some oscillator or other of a small spectral interval
is induced to change its quantum number. This probability can be
calculated in the following way. We denote the number of the final
states which have an energy between $E$ and $E + dE$ by $\rho(E)\,dE$.
(In the example considered, $\rho(E)\,dE$ is the number of oscillators which,
when their quantum number is increased by 1 and the state of the
atom simultaneously is changed into $b$, correspond to a total energy
of the system between $E$ and $E + dE$.) The probability of a transition
into one of these states is found then by multiplying (184') by
$\rho(E)\,dE$. We obtain

$$w(E)\,dE = |\xi|^2\rho(E)\,dE = |H'|^2\,\frac{4}{(E_0 - E)^2}\,\sin^2\frac{(E_0 - E)t}{2\hbar}\,\rho(E)\,dE \tag{187}$$

$|H'|^2$ is to be taken as the mean value of $|H_{nk}'|^2$ for the considered
final states. The initial energy and the final energy of the system are
denoted by $E_0$ and $E$. The factor $w(E)$, as a function of $E$, has a
maximum of the amount $|H'|^2(t^2/\hbar^2)\rho(E_0)$ at $E = E_0$, that is, for a
final state the energy of which equals that of the initial state, and this
maximum is sharp provided that $t$ is sufficiently great. Hence the
probability is noticeably different from zero only for those transitions
which satisfy the principle of energy conservation. We find the total
probability of such a transition which takes place within time $t$ (in our
example, the transition $a \to b$ of the atom together with the emission of
a corresponding light quantum) by integrating (187) over a small interval
$\Delta E$ surrounding the final state $E = E_0$. Since $w(E)$ is very small outside
$\Delta E$, we may just as well integrate from $-\infty$ to $+\infty$, and, since

$$\int_{-\infty}^{+\infty}\frac{\sin^2 x}{x^2}\,dx = \pi$$

we obtain

$$W = \int w(E)\,dE = |H'|^2\rho(E_0)\int_{-\infty}^{+\infty}\frac{4}{(E_0 - E)^2}\,\sin^2\frac{(E_0 - E)t}{2\hbar}\,dE = \frac{2\pi}{\hbar}\,|H'|^2\rho(E_0)\,t \tag{188}$$


An expression of this form holds also if the transition $0 \to n$ is only
possible by the intervention of intermediate states, the only difference
being that $\left|\sum_m \dfrac{H_{nm}'H_{m0}'}{E_0 - E_m}\right|^2$ must be substituted for $|H'|^2$. Then the
second term with the denominator $E_m - E_n$, occurring in (185), can
be omitted (except for $E_n = E_m$), since it is, when compared with the
first term, very small and thus its contribution to $W$ is negligible.
Furthermore transitions are possible only when energy is conserved.
It should be noticed, however, that the transitions into the intermediate
states are not restricted by this condition. The latter may contradict
the conservation law, the intermediate states being permitted
to differ in energy from the initial state by any amount. This does
not mean that the conservation principle loses its validity, because the
intermediate states as such cannot be measured. The conservation
of energy enters the question only when we investigate the probability
of a transition to a final state with energy $E$ which differs from $E_0$.
According to (187), this probability turns out to be negligibly small.
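
The step from (187) to (188) can be verified numerically. The sketch below integrates $w(E)$ over a wide but finite interval around $E_0$ and compares the result with $(2\pi/\hbar)|H'|^2\rho(E_0)t$. It assumes $\hbar = 1$ and a density of states that is constant over the interval; the names and sample values are hypothetical.

```python
import math


def w_density(e, e0, h_sq, rho, t, hbar=1.0):
    """Integrand of (187) for a constant density of states rho (assumption)."""
    d = e0 - e
    if abs(d) < 1e-12:
        # Maximum value |H'|^2 (t^2 / hbar^2) rho at E = E0.
        return h_sq * t ** 2 / hbar ** 2 * rho
    return h_sq * 4.0 / d ** 2 * math.sin(d * t / (2.0 * hbar)) ** 2 * rho


def total_probability_numeric(e0, h_sq, rho, t, half_width, steps=200000, hbar=1.0):
    """Midpoint-rule integral of w(E) over [E0 - half_width, E0 + half_width]."""
    de = 2.0 * half_width / steps
    total = 0.0
    for j in range(steps):
        e = e0 - half_width + (j + 0.5) * de
        total += w_density(e, e0, h_sq, rho, t, hbar) * de
    return total


def total_probability_formula(h_sq, rho, t, hbar=1.0):
    """Right-hand side of (188)."""
    return 2.0 * math.pi / hbar * h_sq * rho * t


if __name__ == "__main__":
    numeric = total_probability_numeric(0.0, 1e-4, 5.0, 10.0, half_width=200.0)
    formula = total_probability_formula(1e-4, 5.0, 10.0)
    print(numeric, formula)
```

The finite interval cuts off the tails of $w(E)$, so the numerical value falls slightly below the formula; the deficit shrinks as the interval is widened.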


PROBLEMS 


1. Compute the first-order perturbation of a system with the help of a unitary 
transformation S = 1 + C which makes H + H’ diagonal, if only terms of the 
first order in C are considered. 

2. Consider an oscillator on which an extra potential az* is acting. What is 
the effect of this potential on the energy levels? 

3. The energy caused by the action of an electric field on an atom is given by 


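$$e\,\mathbf{r}\cdot\mathbf{E} = e\,\mathbf{r}\cdot\mathbf{E}_0\cos 2\pi\nu t = \tfrac{1}{2}\,e\,\mathbf{r}\cdot\mathbf{E}_0\left(e^{2\pi i\nu t} + e^{-2\pi i\nu t}\right)$$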


where r is the vector of the electron. If E has the direction of the z axis, the energy 
may be represented by 4eQE. Find how the vector representing the state of the 
atom changes with time. 

4. Calculate the third-order terms of a perturbation.


5
SYSTEMS OF MANY PARTICLES 


38. Schroedinger’s Equation for the Many-Body Problem. 
The systems considered so far have been assumed to consist exclusively 
of one movable particle, and thus the theory we have at our disposal 
suffices only for systems of the simplest kind such as the oscillator and 
the H atom with fixed nucleus. Therefore it is insufficient for cases 
such as the He atom or the $H_2$ molecule. It is to be expected that we
should encounter a more complex situation when we attempt to extend 
the theory to systems containing many movable particles. First of 
all, the greater number of particles will complicate the calculations. 
In addition, there is another point which has been of no consequence 
in the classical theory for the many-body problem but which is of 
decisive importance for the epistemologically prejudiced quantum 
mechanics. The situation is this: On principle, there is no possibility 
of distinguishing particles of the same kind, such as electrons, from one 
another. Let us imagine that the $n$ electrons of an atom are numbered
$1, 2, 3, \ldots, n$. Now, if a measurement carried out on the atom
shows that at a given instant of time the electrons are in the positions
$P_1, P_2, \ldots, P_n$, then on principle we cannot state which position is
occupied by the electron with the number $i$. Thus, if $P_1$ and $P_2$ for
two particles are found by measurement, the interpretation could be
either that $P_1$ belongs to electron 1 and $P_2$ to electron 2 or that electron
1 is at $P_2$ and electron 2 at $P_1$. This inability to distinguish the par-
ticles leads to very peculiar consequences which are quite unknown in 
classical mechanics. 

To take a simple case, let $n = 2$ and denote the coordinates of the
two particles by $x_1y_1z_1$ and $x_2y_2z_2$ respectively. At first consider the
particles to be distinguishable. Then the state of the system can be
described in wave mechanics by a function $u(x_1y_1z_1x_2y_2z_2t)$ which has
the following significance: the product $u^*u$, when multiplied by the
volume element $dx_1\,dy_1\,dz_1\,dx_2\,dy_2\,dz_2$ of the six-dimensional configuration
space, gives the probability that a measurement, carried out at
time $t$, will find particle 1 within $dx_1\,dy_1\,dz_1$ and particle 2 within
$dx_2\,dy_2\,dz_2$. In wave mechanics it will be possible to treat the many-body
problem by essentially the same method as a single particle,
the only difference being that the probability waves $u(x_1y_1z_1x_2y_2z_2t)$
are propagated in six-dimensional space rather than in three-dimensional
space. This involves no difficulty in the statistical theory, according
to which the waves have a purely symbolic meaning.

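The first thing to do is to set up Schroedinger's equation, and for
this purpose we begin, according to Section 31, with the classical
equation for the Hamiltonian $H$ of the system, which is given by

$$H = \frac{\xi_1^2 + \eta_1^2 + \zeta_1^2}{2m_1} + \frac{\xi_2^2 + \eta_2^2 + \zeta_2^2}{2m_2} + V(x_1y_1z_1x_2y_2z_2) \tag{189}$$

where $V(x_1y_1z_1x_2y_2z_2)$ is the potential energy of the system, the
particles being at $x_1y_1z_1$ and $x_2y_2z_2$ respectively. On substituting
$-(\hbar/i)(\partial/\partial x_1), \ldots, -(\hbar/i)(\partial/\partial z_2)$ for $\xi_1, \ldots, \zeta_2$ and denoting by
$\psi(x_1 \cdots z_2)$ an eigenfunction of the energy, with the eigenvalue $E$,
we obtain

$$-\frac{\hbar^2}{2m_1}\left(\frac{\partial^2\psi}{\partial x_1^2} + \frac{\partial^2\psi}{\partial y_1^2} + \frac{\partial^2\psi}{\partial z_1^2}\right) - \frac{\hbar^2}{2m_2}\left(\frac{\partial^2\psi}{\partial x_2^2} + \frac{\partial^2\psi}{\partial y_2^2} + \frac{\partial^2\psi}{\partial z_2^2}\right) + V(x_1 \cdots z_2)\,\psi = E\psi \tag{190}$$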


Now suppose that the particles are indistinguishable. Then the
potential energy $V(x_1 \cdots z_2)$ is bound to be symmetric in the indices
1 and 2, that is, $V$ must remain identical when 1 and 2 are interchanged,
because this interchange means only that the particles exchange positions,
and this cannot influence the energy because the particles are
exactly the same kind. Let us assume further that the particles are
bound by forces of some kind or other to a nucleus which has a fixed
position. Then the energy $V$ can be resolved into two parts, the first
being due to the attraction of the nucleus and of the form $V'(x_1y_1z_1) +
V'(x_2y_2z_2)$ and the second part $V''(x_1y_1z_1x_2y_2z_2)$ corresponding to the
forces with which the particles act on one another. Equation (190)
can then be put into the form

$$-\frac{\hbar^2}{2m}\left(\nabla_1^2\psi + \nabla_2^2\psi\right) + \left[V'(x_1y_1z_1) + V'(x_2y_2z_2) + V''(x_1 \cdots z_2)\right]\psi = E\psi$$

We attempt a solution of this equation by assuming that the term $V''$
is small compared to the two $V'$ terms. Then we can apply the
methods of the perturbation theory. First neglect $V''$ and solve the
equation

$$-\frac{\hbar^2}{2m}\left(\nabla_1^2 + \nabla_2^2\right)\psi + \left[V'(x_1y_1z_1) + V'(x_2y_2z_2)\right]\psi = E\psi \tag{191}$$


From this approximation of order zero we try to proceed to an approximation
of the first order by taking $V''$ into account. We assume that
the eigenfunctions $\psi_i$ and the eigenvalues $E_i$ of the equation

$$-\frac{\hbar^2}{2m}\nabla^2\psi + V'(xyz)\,\psi = E\psi \tag{192}$$

are known. Then, as is easily shown, (191) can be solved by the
functions

$$\psi_{ik}(x_1 \cdots z_2) = \psi_i(x_1y_1z_1)\,\psi_k(x_2y_2z_2)$$

or, more simply,

$$\psi_{ik} = \psi_i(1)\,\psi_k(2)$$

with the eigenvalues $E_{ik} = E_i + E_k$. To prove this we multiply the
equations

$$-\frac{\hbar^2}{2m}\nabla_1^2\psi_i(1) + V'(1)\,\psi_i(1) = E_i\,\psi_i(1)$$

$$-\frac{\hbar^2}{2m}\nabla_2^2\psi_k(2) + V'(2)\,\psi_k(2) = E_k\,\psi_k(2)$$

by $\psi_k(2)$ and $\psi_i(1)$ respectively and add, obtaining

$$-\frac{\hbar^2}{2m}\left(\nabla_1^2 + \nabla_2^2\right)\psi_{ik} + \left[V'(1) + V'(2)\right]\psi_{ik} = E_{ik}\,\psi_{ik}$$

a relation by which the statement is proved.

But evidently (191) can be solved just as well by the function

$$\psi_{ki} = \psi_k(x_1y_1z_1)\,\psi_i(x_2y_2z_2) = \psi_k(1)\,\psi_i(2)$$

to which the same eigenvalue $E_{ki} = E_k + E_i$ belongs. Thus the two-body
problem, in the zero-order approximation here considered, is
degenerate; for any eigenvalue $E_{ik}$ there exist two eigenfunctions
$\psi_{ik}$ and $\psi_{ki}$ which arise from the possibility of giving two different
occupations to the stationary states $\psi_i$ and $\psi_k$. Therefore we must
correlate this solution of the zero order to a system of principal axes
which, for any eigenvalue $E_{ik}$, where $i \neq k$, contains two axes which
are normal to each other. The space that is spanned by the two
axes contains the directions of all states which do not differ in energy
but only in the occupation of the $i$ and $k$ states.

From the zero-order approximation we proceed to the solution for
the system which is perturbed by the interaction $V''$ of the particles.
The perturbation will remove the degeneracy by establishing two
normal principal axes in the $E_{ik}$ space, the eigenvalues of which, $E_{ik}$ +
$\varepsilon_{ik}'$ and $E_{ik} + \varepsilon_{ik}''$, are slightly different. Concerning the new eigenvalues,
we can determine the correction terms, $\varepsilon_{ik}$, according to Section
36, by making diagonal the part $V''$ of the perturbation which belongs
to the $E_{ik}$ space. Now, according to Section 30, the elements of $V''$
in $K_0$ are given by

$$v_{11} = \int \psi_{ik}^*\,V''\,\psi_{ik}\,dv \qquad v_{12} = \int \psi_{ik}^*\,V''\,\psi_{ki}\,dv$$

$$v_{21} = \int \psi_{ki}^*\,V''\,\psi_{ik}\,dv \qquad v_{22} = \int \psi_{ki}^*\,V''\,\psi_{ki}\,dv \tag{193}$$

If in $v_{11}$ we interchange the variables $x_1y_1z_1$ and $x_2y_2z_2$, the integral
retains its value since the effect is only another notation of the integration
variables; the procedure leaves $V''$ and $dv$ unchanged ($V''$ being
symmetric in the indices 1 and 2) but changes $\psi_{ik}$ into $\psi_{ki}$. Thus we
get $v_{11} = v_{22}$ and $v_{12} = v_{21}$. According to Section 25 the correction
terms $\varepsilon_{ik}$ are now to be determined by the determinant equation

$$\begin{vmatrix} v_{11} - \varepsilon_{ik} & v_{12} \\ v_{12} & v_{22} - \varepsilon_{ik} \end{vmatrix} = 0$$

the solutions of which are $\varepsilon_{ik} = v_{11} \pm v_{12}$. Therefore the eigenvalues
of the system are

$$E_{ik}' = E_{ik} + v_{11} + v_{12} \quad\text{and}\quad E_{ik}'' = E_{ik} + v_{11} - v_{12} \tag{194}$$

In order to obtain the corresponding eigenfunctions of the perturbed
system we must transfer to that coordinate system $K$ of the $E_{ik}$ space
wherein $V''$ is diagonal. The passage to $K$ changes $\psi_{ik}$ and $\psi_{ki}$ into
two other functions $\sigma'$ and $\sigma''$ which represent vectors in the directions
of the new axes. $\sigma'$ and $\sigma''$ are connected with the $\psi$ functions by linear
transformations

$$\sigma' = s_{11}\psi_{ik} + s_{12}\psi_{ki} \qquad \sigma'' = s_{21}\psi_{ik} + s_{22}\psi_{ki}$$

corresponding to the transformation equation $x' = Sx$. According
to these equations, $\sigma'$ is a vector whose components, when referred to
the original system $K_0$, are $s_{11}$ and $s_{12}$, whereas $\sigma''$ is the vector with
components $s_{21}$ and $s_{22}$. As $\sigma'$ is transformed by $V''$ into $\varepsilon_{ik}'\sigma'$, we
have

$$v_{11}s_{11} + v_{12}s_{12} = \varepsilon_{ik}'\,s_{11} = (v_{11} + v_{12})\,s_{11}$$

$$v_{21}s_{11} + v_{22}s_{12} = \varepsilon_{ik}'\,s_{12} = (v_{11} + v_{12})\,s_{12}$$

Both equations give $s_{11} - s_{12} = 0$, and therefore $s_{11} = s_{12} = \alpha$.
Similarly $s_{21} = -s_{22} = \beta$, and thus for $\sigma'$ and $\sigma''$ we have

$$\sigma' = \alpha(\psi_{ik} + \psi_{ki}) \qquad \sigma'' = \beta(\psi_{ik} - \psi_{ki}) \tag{195}$$


Finally, we require that the functions $\sigma'$ and $\sigma''$ as well as $\psi_{ik}$ and $\psi_{ki}$ be
normalized so that $\int \sigma'^*\sigma'\,dv = \int \sigma''^*\sigma''\,dv = 1$. Then, from (195),
because of the orthogonality of $\psi_{ik}$ and $\psi_{ki}$, we obtain

$$2\alpha^*\alpha = 2\beta^*\beta = 1$$

and therefore

$$\alpha = \beta = \frac{1}{\sqrt{2}}$$

We can state the result as follows: In the two-body problem the
eigenfunctions, because of the perturbation $V''$, are no longer $\psi_{ik}$ and
$\psi_{ki}$ but $\sigma' = (1/\sqrt{2})(\psi_{ik} + \psi_{ki})$ and $\sigma'' = (1/\sqrt{2})(\psi_{ik} - \psi_{ki})$, the
corresponding eigenvalues being $E_{ik}' = E_{ik} + v_{11} + v_{12}$ and $E_{ik}'' =
E_{ik} + v_{11} - v_{12}$.
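
The diagonalization just described amounts to finding the principal axes of a symmetric two-by-two matrix with equal diagonal elements. The following sketch, with hypothetical numerical values for $E_{ik}$, $v_{11}$ and $v_{12}$, checks that the vectors $(1, 1)/\sqrt{2}$ and $(1, -1)/\sqrt{2}$ are eigenvectors of that matrix with the eigenvalues $v_{11} \pm v_{12}$.

```python
import math


def perturbation_matrix(v11, v12):
    """Matrix of V'' in the E_ik space, using v22 = v11 and v21 = v12."""
    return [[v11, v12], [v12, v11]]


def mat_vec(m, v):
    """Product of a 2x2 matrix with a 2-component vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]


def residual(m, v, eigenvalue):
    """Largest component of (M v - eigenvalue * v); zero for an eigenvector."""
    mv = mat_vec(m, v)
    return max(abs(mv[j] - eigenvalue * v[j]) for j in range(2))


def split_levels(e_ik, v11, v12):
    """Perturbed eigenvalues (194): symmetric level first, antisymmetric second."""
    return e_ik + v11 + v12, e_ik + v11 - v12


if __name__ == "__main__":
    v11, v12 = 0.3, 0.1  # hypothetical matrix elements
    m = perturbation_matrix(v11, v12)
    s = 1.0 / math.sqrt(2.0)
    symmetric = [s, s]        # components of sigma' in K0
    antisymmetric = [s, -s]   # components of sigma'' in K0
    print(residual(m, symmetric, v11 + v12))
    print(residual(m, antisymmetric, v11 - v12))
    print(split_levels(-2.0, v11, v12))
```

Both residuals vanish to rounding accuracy, and the splitting between the two levels is $2v_{12}$.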

39. Symmetric and Antisymmetric Solutions. Let us here
recall the epistemological principle that, for the theory to have
any meaning at all, each of its statements must be testable by
experiment. This point is important when we apply the theory which
has been developed to a system in which the particles are indistinguishable.
The conclusion was reached that, when we carry out a measurement
of the energy of a system, we shall find the value $E_{ik}'$ with a
certain probability and the value $E_{ik}''$ with another probability.
Let us consider first the simple case where only the two eigenvalues of a
certain $E_{ik}$ space have probabilities that differ from zero. Then the
state can be described in the form

$$u(x_1 \cdots z_2) = x'\,\sigma'(x_1 \cdots z_2) + x''\,\sigma''(x_1 \cdots z_2) \tag{196}$$

wherein we designate the components of the representing vector in
the direction of the principal axes $E_{ik}'$ and $E_{ik}''$ by $x'$ and $x''$. The
decisive point here is that, in the dependence on $x_1 \cdots z_2$, $u^*u$ gives
the probability of finding particle 1 at $P_1(x_1y_1z_1)$ and particle 2 at
$P_2(x_2y_2z_2)$. But, on the assumption that the particles are indistinguishable,
all that we can arrive at by experimental means is the
probability $W$ of finding one of the particles at $P_1$ and simultaneously
the other at $P_2$. When this fact is given by a measurement, it can as
well mean that 1 is at $P_1$ and 2 at $P_2$ as that 1 is at $P_2$ and 2 at $P_1$.
Thus each of these two possibilities has the same probability, $W/2$,
and accordingly we can give the following postulate: A theory of the
many-body problem is acceptable only if it furnishes the same probability
for the configuration $x_1y_1z_1x_2y_2z_2$ as it does for $x_2y_2z_2x_1y_1z_1$ or,
in other words, if the probability function $|u(x_1y_1z_1x_2y_2z_2)|^2$ is symmetric
in the indices 1 and 2. If $|u(x_1y_1z_1x_2y_2z_2)|^2$ differed from
$|u(x_2y_2z_2x_1y_1z_1)|^2$, the theory would hold that one of the two configurations,
1 at $P_1$ and 2 at $P_2$, or 1 at $P_2$ and 2 at $P_1$, would have a greater probability
than the other, a situation that could not be tested by experiment.

We wish to examine now whether (196) satisfies the above postulate.
We have

$$\begin{aligned}
u^*u ={}& \tfrac{1}{2}\,x'^*x'\,(\psi_{ik}^* + \psi_{ki}^*)(\psi_{ik} + \psi_{ki}) \\
&+ \tfrac{1}{2}\,x''^*x''\,(\psi_{ik}^* - \psi_{ki}^*)(\psi_{ik} - \psi_{ki}) \\
&+ \tfrac{1}{2}\,x''^*x'\,(\psi_{ik}^* - \psi_{ki}^*)(\psi_{ik} + \psi_{ki}) \\
&+ \tfrac{1}{2}\,x'^*x''\,(\psi_{ik}^* + \psi_{ki}^*)(\psi_{ik} - \psi_{ki})
\end{aligned} \tag{197}$$

The first two terms remain unchanged when the indices 1 and 2 are
interchanged, whereas the terms with $x''^*x'$ and $x'^*x''$ reverse signs.
This forces us to impose a supplementary condition on the theory by
the assumption: There is in reality no state for which both the components
$x'$ and $x''$ differ from zero, but the system is with certainty
either in the state $\sigma'$ ($x''$ being zero) or in the state $\sigma''$ ($x'$ being zero).
In both cases the mixed terms with $x''^*x'$ and $x'^*x''$ (which terms make
$u^*u$ non-symmetric) vanish. In what follows, $\sigma'$ will be called symmetric
and $\sigma''$ antisymmetric.
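
The behaviour of the individual terms of (197) under the interchange of 1 and 2 can be made concrete with a one-dimensional model. In the sketch below the one-particle functions are taken to be those of a particle in a box of unit length, which is an assumption made purely for illustration, as are the function names and the sample points.

```python
import math


def psi(n, x):
    """Hypothetical one-particle eigenfunction: particle in a box of unit length."""
    return math.sqrt(2.0) * math.sin(n * math.pi * x)


def sigma_sym(x1, x2, i, k):
    """Symmetric combination (psi_ik + psi_ki) / sqrt(2)."""
    return (psi(i, x1) * psi(k, x2) + psi(k, x1) * psi(i, x2)) / math.sqrt(2.0)


def sigma_anti(x1, x2, i, k):
    """Antisymmetric combination (psi_ik - psi_ki) / sqrt(2)."""
    return (psi(i, x1) * psi(k, x2) - psi(k, x1) * psi(i, x2)) / math.sqrt(2.0)


def density(x1, x2, a, b, i=1, k=2):
    """|u|^2 for u = a*sigma' + b*sigma'' with components a, b (may be complex)."""
    u = a * sigma_sym(x1, x2, i, k) + b * sigma_anti(x1, x2, i, k)
    return abs(u) ** 2


if __name__ == "__main__":
    p1, p2 = 0.2, 0.7
    s = 1.0 / math.sqrt(2.0)
    for label, a, b in (("symmetric", 1.0, 0.0),
                        ("antisymmetric", 0.0, 1.0),
                        ("mixed", s, s)):
        print(label, density(p1, p2, a, b), density(p2, p1, a, b))
```

For the pure symmetric and the pure antisymmetric state the two densities coincide; for the mixed state they differ, which is the situation excluded by the postulate.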

It is easy to see how the considerations are to be generalized when,
in addition to $\sigma'$ and $\sigma''$ of the $E_{ik}$ space, we consider also the eigenfunctions
of the other spaces. If, to be more accurate, we denote the
eigenfunctions by $\sigma_{ik}'$ and $\sigma_{ik}''$, any state of the system can be represented
by

$$u(x_1 \cdots z_2) = \sum_{i<k} x_{ik}'\,\sigma_{ik}' + \sum_{i<k} x_{ik}''\,\sigma_{ik}'' + \sum_i x_{ii}\,\sigma_{ii} \tag{198}$$

where $\sigma_{ii}$ represents the function $\psi_i(1)\psi_i(2)$ of the eigenvalue $2E_i$, which
occurs only once. The requirement that $u^*u$ be symmetric in 1 and 2
then leads to the conclusion that either all the $x_{ik}''$ components be
zero or all the $x_{ik}'$ components together with the $x_{ii}$ terms vanish.
The $x_{ii}$ terms must be included with the $x_{ik}'$ ones because $\psi_{ii} =
\psi_i(1)\psi_i(2)$ does not invert the sign when 1 and 2 are interchanged;
therefore $\sigma_{ii}$ must be considered a symmetric eigenfunction. So we
may state the theorem: A system of indistinguishable particles is
observed only in those states in which $u$ is made up of symmetric eigenfunctions
only or of antisymmetric ones only.

It must be emphasized, however, that this theorem holds only for 
those systems wherein the particles are indistinguishable from each
other. If by experiment a distinction is possible, the unsymmetric $u$
have a meaning also. We are then no longer entitled to permit only
symmetric or antisymmetric eigenfunctions to enter the function $u$, for
then there is no contradiction in it being possible for experiment to
describe a state, for example, by the expression

$$u = x_{ik}'\,\sigma_{ik}' + x_{ik}''\,\sigma_{ik}''$$

with $x_{ik}'$ and $x_{ik}'' \neq 0$. Although it is true that this implies different
probabilities for the configurations $P_1P_2$ and $P_2P_1$, these configurations
can indeed be distinguished experimentally, so that there is reason in
speaking of the states $\psi_{ik}$ and $\psi_{ki}$. A state corresponds to $\psi_{ik}$ if an
energy measurement carried out on both the particles separately gives
the values $E_i$ and $E_k$ for the first and second particle respectively,
whereas the inverse correlation of the same energy values indicates
a state $\psi_{ki}$.

It is interesting to consider from another point of view a system
with two distinguishable particles. Let us assume that we know the
system to be in a certain state $\psi_{ik}$ at time $t = 0$. In order to find the vector
representing the state we must substitute $\psi_{ik}$ for $u$ in (198). Then we
see that the equation, in order to be fulfilled, requires that all the $x$
components that do not belong to the $E_{ik}$ space be zero, whereas
$x_{ik}' = x_{ik}'' = 1/\sqrt{2}$, since we then obtain

$$\psi_{ik} = \tfrac{1}{2}(\psi_{ik} + \psi_{ki}) + \tfrac{1}{2}(\psi_{ik} - \psi_{ki})$$

It is a question now of what will become of the state $x_0' = x_0'' =
1/\sqrt{2}$ (omitting the subscripts and indicating by the index 0 the correlation
to $t = 0$) when the system is left alone. This question is
answered by the dynamical law, $(\hbar/i)(dx/dt) = Hx$, from which it
follows that for the time-dependence of the components $x'$ and $x''$
we have

$$x' = x_0'\,e^{(i/\hbar)E_{ik}'t} \qquad x'' = x_0''\,e^{(i/\hbar)E_{ik}''t}$$

and therefore $u$, in its time-dependence, is

$$\begin{aligned}
u(x_1 \cdots z_2, t) &= \tfrac{1}{2}(\psi_{ik} + \psi_{ki})\,e^{(i/\hbar)E_{ik}'t} + \tfrac{1}{2}(\psi_{ik} - \psi_{ki})\,e^{(i/\hbar)E_{ik}''t} \\
&= \tfrac{1}{2}\psi_{ik}\left(e^{(i/\hbar)E_{ik}'t} + e^{(i/\hbar)E_{ik}''t}\right) + \tfrac{1}{2}\psi_{ki}\left(e^{(i/\hbar)E_{ik}'t} - e^{(i/\hbar)E_{ik}''t}\right)
\end{aligned} \tag{199}$$

This means that the probability of finding the system in the state
$\psi_{ik}$ at time $t$ is
$$\begin{aligned}
\tfrac{1}{4}\left(e^{-(i/\hbar)E_{ik}'t} + e^{-(i/\hbar)E_{ik}''t}\right)\left(e^{(i/\hbar)E_{ik}'t} + e^{(i/\hbar)E_{ik}''t}\right)
&= \tfrac{1}{4}\left[2 + 2\cos\frac{(E_{ik}' - E_{ik}'')t}{\hbar}\right] \\
&= \cos^2\frac{(E_{ik}' - E_{ik}'')t}{2\hbar} = \cos^2\frac{v_{12}t}{\hbar}
\end{aligned} \tag{200}$$

whereas the probability of the state $\psi_{ki}$ is $\sin^2 (v_{12}/\hbar)t$. Hence the
probabilities of $\psi_{ik}$ and $\psi_{ki}$ change periodically between 0 and 1, having
a period $\tau = h/2v_{12}$ (with $h = 2\pi\hbar$). If at $t = 0$ the system is with certainty in the
state $\psi_{ik}$, then at $t = h/4v_{12}$ it will with certainty be in the state $\psi_{ki}$.
This phenomenon can be described as a "place exchange" between
the particles, the word "place" meaning the coordination of the
particles to a certain stationary state. At times an interpretation
has been suggested that the particles change positions once within the
interval $h/4v_{12}$. Such an interpretation is, however, not quite correct.
It must not be assumed that the exchange takes place in a discontinuous
way, but rather that the transition from $\psi_{ik}$ to $\psi_{ki}$ is the continuous
process described by (199). According to this description the system,
in the time $h/4v_{12}$, passes through a sequence of states which are not
defined with respect to the positions of the particles. When the system is
in a state for which neither $x'$ nor $x''$ is zero, it would be incorrect to
say that it is in the state $\psi_{ik}$ with the probability $x'^*x'$ and in the state
$\psi_{ki}$ with the probability $x''^*x''$, because, actually, it can only be
maintained that a measurement of the energies, made on the two
particles, provides the results $\psi_{ik}$ and $\psi_{ki}$ with the stated probabilities.
But we destroy the state in question by the measurement, forcing the
system to assume one of the states $\psi_{ik}$ and $\psi_{ki}$. And the meaning of
$x'^*x'$ and $x''^*x''$ is that they determine the probabilities of the system
favoring $\psi_{ik}$ or $\psi_{ki}$.
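
The periodic exchange can be followed numerically by evaluating the coefficients of $\psi_{ik}$ and $\psi_{ki}$ in (199). This is a sketch only: units with $\hbar = 1$ are assumed, so that $h = 2\pi$, and the energies and the value of $v_{12}$ are hypothetical.

```python
import cmath
import math

HBAR = 1.0  # assumption: natural units, so that h = 2*pi*HBAR


def amplitudes(t, e_sym, e_anti, hbar=HBAR):
    """Coefficients of psi_ik and psi_ki in (199) at time t."""
    f_sym = cmath.exp(1j * e_sym * t / hbar)
    f_anti = cmath.exp(1j * e_anti * t / hbar)
    return 0.5 * (f_sym + f_anti), 0.5 * (f_sym - f_anti)


def probabilities(t, e_sym, e_anti, hbar=HBAR):
    """Probabilities of finding psi_ik and psi_ki at time t, as in (200)."""
    c_ik, c_ki = amplitudes(t, e_sym, e_anti, hbar)
    return abs(c_ik) ** 2, abs(c_ki) ** 2


if __name__ == "__main__":
    v12 = 0.25                                 # hypothetical exchange element
    e_sym, e_anti = 1.0 + v12, 1.0 - v12       # E' and E'' of (194)
    t_swap = 2.0 * math.pi * HBAR / (4.0 * v12)  # h / (4 v12)
    for t in (0.0, 0.5 * t_swap, t_swap, 2.0 * t_swap):
        print(t, probabilities(t, e_sym, e_anti))
```

At $t = h/4v_{12}$ the first probability vanishes and the second equals one; after twice that time the system has returned to $\psi_{ik}$.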

40. The Exclusion Principle Relative to a Combination of Symmetric
and Antisymmetric States. We return here to systems of
indistinguishable particles. As we have seen, such systems can exist
only in states that are represented by a sum of symmetric eigenfunctions
only or antisymmetric functions only. However, in order to
be sure that this assumption complies with a consistent theory, we
have to examine the change which a system undergoes according to
the law $(\hbar/i)(dx/dt) = Hx$. The question is whether it could not
be possible that a system, starting from an admissible state at $t = 0$,
changes in such a way that it occupies states in which symmetric and
antisymmetric components are mixed. It is seen at once that for an
unperturbed system this is impossible, for, since $x = x_0 e^{(i/\hbar)Et}$, a
component will remain zero if it is zero at $t = 0$. However, we can
prove also that by the emission or absorption of light a transition from
a symmetric into an antisymmetric state or its reverse cannot be induced.
To show this we take into consideration the fact that according to
Section 20 the probability of a transition from the symmetric state
$\sigma_{ik}'$ into the antisymmetric state $\sigma_{lm}''$ depends on the integral

$$\int (x_1 + x_2)\,\sigma_{ik}'^*\,\sigma_{lm}''\,dv \tag{201}$$

the integral being extended over the six-dimensional configuration
space. From the correspondence principle it is readily understood
that now we must substitute the sum $x_1 + x_2$ for $x$; for, in the case of
two particles, the expectation value of the electric moment is given by
the integral $e\int (x_1 + x_2)\,u^*u\,dv$ which when resolved furnishes the
term

$$e\,x_{ik}'^*\,x_{lm}''\,e^{(i/\hbar)(E_{lm}'' - E_{ik}')t}\int (x_1 + x_2)\,\sigma_{ik}'^*\,\sigma_{lm}''\,dv$$

If the integral in this expression is zero, the quantum-mechanical
interpretation is that a transition $\sigma_{ik}' \rightleftarrows \sigma_{lm}''$ never occurs, either in
the direction in which the transition is connected with the emission of
light or in the other direction, which leads to absorption. In this case,
expression (201) is truly zero, for the expression remains the same when
the indices 1 and 2 are interchanged, the interchange meaning only
another notation of the integration variables. Thus

$$\int (x_1 + x_2)\,\sigma_{ik}'^*(1, 2)\,\sigma_{lm}''(1, 2)\,dv = \int (x_2 + x_1)\,\sigma_{ik}'^*(2, 1)\,\sigma_{lm}''(2, 1)\,dv$$

Because $\sigma_{ik}'(1, 2) = \sigma_{ik}'(2, 1)$ and $\sigma_{lm}''(1, 2) = -\sigma_{lm}''(2, 1)$ this can be
true only if the integral is zero.

Accordingly we come to the conclusion that a system which is in a
symmetric or an antisymmetric state will retain its character for the
entire future. That is all that we may maintain if quantum mechanics
is true. The further question is whether there are, for different samples
of the same system, both symmetric and antisymmetric states, or
whether only one of the two types is realized; if so, which of the two it is
can be determined by experiment only. As we shall see later on, it happens
that the character depends on the spin of the particles and is symmetric
or antisymmetric depending on whether the spin is an integer or half-integer
multiple of $\hbar$.
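
The vanishing of the integral (201) can be illustrated in a one-dimensional model, where the configuration space reduces to the square $0 \le x_1, x_2 \le 1$. The sketch below uses particle-in-a-box functions and a midpoint-rule sum; both are assumptions made for the example, and since the functions are real the complex conjugation is omitted.

```python
import math


def psi(n, x):
    """Hypothetical one-particle eigenfunction: particle in a box of unit length."""
    return math.sqrt(2.0) * math.sin(n * math.pi * x)


def sigma(sign, i, k):
    """Return (psi_i(1)psi_k(2) + sign*psi_k(1)psi_i(2))/sqrt(2); sign = +1 or -1."""
    def state(x1, x2):
        return (psi(i, x1) * psi(k, x2) + sign * psi(k, x1) * psi(i, x2)) / math.sqrt(2.0)
    return state


def dipole_integral(state_a, state_b, n=200):
    """Midpoint-rule estimate of the integral of (x1 + x2) * a * b over the unit square."""
    h = 1.0 / n
    total = 0.0
    for p in range(n):
        x1 = (p + 0.5) * h
        for q in range(n):
            x2 = (q + 0.5) * h
            total += (x1 + x2) * state_a(x1, x2) * state_b(x1, x2)
    return total * h * h


if __name__ == "__main__":
    # Symmetric with antisymmetric: the integral (201) vanishes.
    print(dipole_integral(sigma(+1, 1, 2), sigma(-1, 1, 3)))
    # Symmetric with symmetric: in general different from zero.
    print(dipole_integral(sigma(+1, 1, 2), sigma(+1, 1, 3)))
```

The first value is zero up to rounding errors because the contributions of the points $(x_1, x_2)$ and $(x_2, x_1)$ cancel in pairs, which is the argument of the text in discrete form.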



41. The Helium Atom. As an illustration we consider the neutral
He atom. For a long time it has been known that the He spectrum
corresponds to two systems of terms which never (more precisely, very
seldom) combine and which are called ortho and para terms. There
is no doubt about how this fact is to be interpreted: The two electrons
of the atom must in some way be distinguishable, the mark of distinction
being the direction of spin. The function $u$, which represents the
state of the atom, is therefore composed of symmetric and antisymmetric
eigenfunctions which are to be identified with the ortho and
para terms. That there actually is a weak combination between the
two types of terms is due to the fact that, according to the rigorous
theory, the integrals (201) are not exactly zero but only very small.

To carry out this idea we must find out first how the two classes of
states and the two observed types of terms must be coordinated. For
this purpose we compare the ground states of the para and ortho series.
The experimental evidence shows that the lowest energy level (denoted
by 1S) of the para type lies far below the lowest ortho level, 2S. This
fact suggests the following interpretation: In the zero-order approximation
each of the two electrons moves in the same way as it would in
an H atom of which the nucleus is doubly charged. According to
the theory of the H atom, the lowest level will, therefore, come about
if both the electrons are in the ground state, $n = 1$, $l = m = 0$ (cf.
Section 18). If we designate by $\psi_1$ the eigenfunction of the H atom
which corresponds to this state (according to Section 18 it would be
more precise to write $\psi_{100}$), the lowest energy term of He corresponds
to the state $\sigma_{11} = \psi_1(1)\psi_1(2)$, which state belongs to the symmetric
class, and so we must infer that the para levels are to be coordinated to
the symmetric and the ortho levels to the antisymmetric states. This
conclusion is confirmed by an examination of the next lowest energy
levels which must arise when one of the electrons is moving on the
orbit $n = 1$, $l = m = 0$, and the other on the orbit $n = 2$, $l = m = 0$.
Then the corresponding eigenfunctions are

$$\begin{aligned}
\sigma'(12) &= \frac{1}{\sqrt{2}}\left[\psi_1(1)\psi_2(2) + \psi_1(2)\psi_2(1)\right] \\
\sigma''(12) &= \frac{1}{\sqrt{2}}\left[\psi_1(1)\psi_2(2) - \psi_1(2)\psi_2(1)\right]
\end{aligned} \tag{202}$$

of which, if our interpretation is correct, the first represents a para
and the second an ortho state. As the energies of the two states have
nearly the same value, the first ortho S term should coincide approximately
with the second para S term, and they actually do. From the
observed spectrum it is to be inferred that the para term must lie a
little higher than the ortho term. This conclusion is also in agreement
with the theory according to which the energies of the symmetric
and antisymmetric eigenfunctions are

$$E_{12}' = E_1 + E_2 + v_{11} + v_{12} \quad\text{and}\quad E_{12}'' = E_1 + E_2 + v_{11} - v_{12}$$

respectively. Thus the difference is $2v_{12}$, for which the theory gives a
positive value.

There is, however, still another point to be explained: The term
arrangement of ortho helium is a triplet system, whereas the para terms
are single. The spin of the electron must be taken into account to
explain this fact. The electron has an angular momentum of its own.
When the component of this momentum in an arbitrary direction $z$ is
measured, the result is either $+\hbar/2$ or $-\hbar/2$, the probabilities of
either depending on the internal state of the electron. Accordingly
this state can be described by a function $\chi(m)$, the argument $m$ being
capable only of the values $+\tfrac{1}{2}$ and $-\tfrac{1}{2}$. $\chi(+\tfrac{1}{2})$ determines by
$|\chi(+\tfrac{1}{2})|^2$ the probability of finding the value $+\hbar/2$ for the $z$ component
of spin, whereas $\chi(-\tfrac{1}{2})$ belongs to $-\hbar/2$.

The function $\chi(m)$ can be pictured as a unit vector $\chi$ which lies in a
two-dimensional "spin space" and which, when referred to a certain
normal coordinate system $K$, has the components $x_1 = \chi(+\tfrac{1}{2})$
and $x_2 = \chi(-\tfrac{1}{2})$. When the vector $\chi$ has the direction of one of the
two axes (in this case $\chi$ may be designated by $\chi^+$ and $\chi^-$ respectively),
it signifies a state in which the values $+\hbar/2$ and $-\hbar/2$ respectively for
the $z$ component of the angular momentum are found with certainty.
The components of $\chi^+$ are $x_1 = \chi(+\tfrac{1}{2}) = 1$ and $x_2 = \chi(-\tfrac{1}{2}) = 0$,
and those of $\chi^-$ are $x_1 = \chi(+\tfrac{1}{2}) = 0$ and $x_2 = \chi(-\tfrac{1}{2}) = 1$. As $\chi^+$
and $\chi^-$ are orthogonal, the product $(\chi^+\chi^-) = 0$.

We can now represent the state of an electron with energy $E_i$ by
the product of two vectors one of which, when resolved relative to the
principal axes belonging to the $xyz$ coordinates of the particle, is given
by $\psi_i(xyz)$ and refers to the position of the electron, while the other,
$\chi(m)$, corresponds to the spin. $|\psi_i(xyz)\chi(m)|^2$ is the probability that a
simultaneous measurement of position and spin will give the results
$xyz$ and $m$. $\psi_i(xyz)\chi^+$ and $\psi_i(xyz)\chi^-$ denote states in which the $z$
component of spin is found with certainty to be $+\hbar/2$ and $-\hbar/2$
respectively. No difficulty is encountered in applying this formalism to
the case of two particles. We must consider, for example,

$$\left[\psi_i(1)\psi_k(2) + \psi_i(2)\psi_k(1)\right]\chi^+(1)\chi^+(2)$$

a representation of a state in which the electrons belong to the $i$
and $k$ levels respectively, whereas the $z$ component of spin has the
value $+\hbar/2$ for both electrons. When the spin is $+\hbar/2$ for one electron
and $-\hbar/2$ for the other, we must take $\chi^+(1)\chi^-(2) \pm \chi^+(2)\chi^-(1)$
as a factor. This leads to eight different states in all, four of
which are symmetric and four antisymmetric. The first are given by

$$\begin{aligned}
&\sigma_{ik}'\,\chi^+(1)\chi^+(2) \\
&\sigma_{ik}'\,\chi^-(1)\chi^-(2) \\
&\sigma_{ik}'\left[\chi^+(1)\chi^-(2) + \chi^+(2)\chi^-(1)\right] \\
&\sigma_{ik}''\left[\chi^+(1)\chi^-(2) - \chi^+(2)\chi^-(1)\right]
\end{aligned} \tag{203}$$

The other four are given by

$$\begin{aligned}
&\sigma_{ik}''\,\chi^+(1)\chi^+(2) \\
&\sigma_{ik}''\,\chi^-(1)\chi^-(2) \\
&\sigma_{ik}''\left[\chi^+(1)\chi^-(2) + \chi^+(2)\chi^-(1)\right] \\
&\sigma_{ik}'\left[\chi^+(1)\chi^-(2) - \chi^+(2)\chi^-(1)\right]
\end{aligned} \tag{203'}$$

All eight states are orthogonal to each other, as is seen from the product
$(\chi^+\chi^-) = 0$. When 1 and 2 are interchanged, the first four remain
unaltered but the latter four reverse signs.
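
The classification of the eight products into a symmetric and an antisymmetric group can be checked mechanically by interchanging both the coordinates and the spin arguments of the two particles. The sketch below is an illustration under assumptions: the spatial functions are one-dimensional particle-in-a-box functions, the spin combinations carry a normalizing factor $1/\sqrt{2}$ that the list above leaves out, and all names are hypothetical.

```python
import math

UP, DOWN = 0.5, -0.5  # the two admissible values of the spin argument m


def chi_plus(m):
    return 1.0 if m == UP else 0.0


def chi_minus(m):
    return 1.0 if m == DOWN else 0.0


def psi(n, x):
    """Hypothetical one-particle spatial function (particle in a unit box)."""
    return math.sqrt(2.0) * math.sin(n * math.pi * x)


def sigma(sign, x1, x2, i=1, k=2):
    """Spatial part: sign = +1 gives sigma', sign = -1 gives sigma''."""
    return (psi(i, x1) * psi(k, x2) + sign * psi(k, x1) * psi(i, x2)) / math.sqrt(2.0)


def spin_pair(kind, m1, m2):
    """Two-particle spin factor; kind is '++', '--', 'sym' or 'anti'."""
    if kind == "++":
        return chi_plus(m1) * chi_plus(m2)
    if kind == "--":
        return chi_minus(m1) * chi_minus(m2)
    sign = 1.0 if kind == "sym" else -1.0
    return (chi_plus(m1) * chi_minus(m2) + sign * chi_plus(m2) * chi_minus(m1)) / math.sqrt(2.0)


def exchange_sign(space_sign, spin_kind):
    """+1 if the product is unaltered by 1 <-> 2, -1 if it reverses sign, 0 otherwise."""
    signs = set()
    for x1, x2 in ((0.2, 0.7), (0.35, 0.9)):
        for m1 in (UP, DOWN):
            for m2 in (UP, DOWN):
                a = sigma(space_sign, x1, x2) * spin_pair(spin_kind, m1, m2)
                b = sigma(space_sign, x2, x1) * spin_pair(spin_kind, m2, m1)
                if abs(a) < 1e-12 and abs(b) < 1e-12:
                    continue
                if abs(a - b) < 1e-12:
                    signs.add(1)
                elif abs(a + b) < 1e-12:
                    signs.add(-1)
                else:
                    signs.add(0)
    return signs.pop() if len(signs) == 1 else 0


GROUP_203 = [(+1, "++"), (+1, "--"), (+1, "sym"), (-1, "anti")]
GROUP_203_PRIME = [(-1, "++"), (-1, "--"), (-1, "sym"), (+1, "anti")]

if __name__ == "__main__":
    print([exchange_sign(s, k) for s, k in GROUP_203])
    print([exchange_sign(s, k) for s, k in GROUP_203_PRIME])
```

Every member of the first list is unaltered by the interchange and every member of the second reverses its sign, in agreement with the grouping into (203) and (203').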

Now we can readily understand the peculiarities of the helium spectrum.
First of all, we know that the probability for a transition
between a symmetric and an antisymmetric state vanishes; the probability
differs from zero only for transitions within the same group.
Thus the exclusion principle is not to be applied to the $\sigma_{ik}'$ and $\sigma_{ik}''$
functions but to the whole of the solutions (203) and (203'). This
means that within the antisymmetric group (as will be seen, this is the
only group realized in nature) the fourth solution may be combined
with the other three, so that we can now understand how transitions
between $\sigma_{ik}'$ and $\sigma_{ik}''$ are possible in helium. The reason is that $\sigma_{ik}'$
and $\sigma_{ik}''$ are not, by themselves, the eigenfunctions of the system but
have to be connected with the spin function $\chi$. Therefore we cannot
deduce the character of the state from the character of $\sigma$. The para
function $\sigma_{ik}'$ may describe an antisymmetric state, and the antisymmetric
ortho function $\sigma_{ik}''$ may describe a symmetric one.

It is to be seen further that, if all eight eigenfunctions of (203) and
(203') were realized, the theory would require that four different ortho
and as many para terms occur in nature. Actually, however, only
three terms of the first kind and one of the second exist. From this
fact it must be inferred that, of the eight solutions, (203) and (203'),
148 SYSTEMS OF MANY PARTICLES [Cu. 5 


only the four of the antisymmetric group are realized. As three of these 
are of the ortho and one of the para type, our assumption (correspond- 
ing to the rule formulated at the end of the preceding section) leads 
just to the observed multiplicity of terms. However, the difference 
of the three ortho terms is small, owing to the weak magnetic inter- 
action of the electrons caused by spin. This triplet structure of the 
ortho system was discovered after Heisenberg had predicted it 
theoretically. 

Among the para terms, that of the ground state is of particular 
importance because it measures the ionization potential of the atom. 
The fact that Kellner and Hylleraas obtained for the potential a value 
that differs from the experimental one only by a few parts per thousand 
verifies the accuracy of the quantum-mechanical He model. 

42. Systems of Many Similar Particles. Method of Particle 
Picture. At this point we are going to generalize our considerations 
by applying them to systems that contain an arbitrary number n of 
particles. As an example, we might take an atom the nucleus of which
is surrounded by more than two electrons, but we could just as well
consider a gas, regardless of whether its particles are material corpuscles
or light quanta. However, we shall simplify the problem by
assuming that any interaction between the particles can be disregarded.
In addition we assume that the mechanical nature of
every particle is described by the same Hamiltonian $H$, so that the
vector $\mathbf{x}$ of any particle satisfies the equation $(\hbar/i)(d\mathbf{x}/dt) = H\mathbf{x}$.
No special assumptions are made regarding the kind of function, so
that for $H$ too there may be substituted energy operators that correspond
to the relativistic wave equations of the next chapter.

At first we imagine that the particles are distinguishable and marked
$1, 2, \ldots, n$. To avoid confusion we must use a carefully designed
notation. For any particle there is a certain number of observables
$q$ which can be measured simultaneously and which have eigenspectra
that may be assumed to be discrete. We denote by $K_0$ the coordinate
system in which all the (commutative) $q$ terms are diagonal. Any
axis of $K_0$ corresponds to certain $q$ values in the sense that a vector $\mathbf{x}$
the direction of which coincides with the axis is transformed by the
operator $Q$, which corresponds to the coordinate $q$, into $a\mathbf{x}$; the factor
$a$ is the eigenvalue of $Q$ belonging to the considered axis. Any axis of
$K_0$ may, therefore, be characterized by the totality of the corresponding
eigenvalues. We simplify the notation by labeling the axes by
the numbers $k = 1, 2, \ldots$ and setting down for any $k$ the corresponding
eigenvalues $a$. Then we may denote by $\mathbf{x}_k$ a unit vector which
has the direction of the $k$th axis and indicate by an upper index $i$, for
example $\mathbf{x}_k^{(i)}$, that the vector is meant to represent the state of the $i$th
particle.

We pass now from the single particle to the whole system. We
extend the system $K_0$ of the single particle to a system $K_n$ which combines
$n$ $K_0$ systems in such a way that any axis of $K_n$ defines a state
$k_1$ of the first particle, a state $k_2$ of the second, and so on to the state
$k_n$ of the $n$th particle. This means that any unit vector $\mathbf{X}$, defined by
an axis of $K_n$, is a product† of $n$ unit vectors,
$\mathbf{x}_{k_1}^{(1)}, \mathbf{x}_{k_2}^{(2)}, \ldots, \mathbf{x}_{k_n}^{(n)}$;
that is,

$$
\mathbf{X} = \mathbf{x}_{k_1}^{(1)}\,\mathbf{x}_{k_2}^{(2)}\cdots\mathbf{x}_{k_n}^{(n)}
\tag{204}
$$

Thus a unit vector $\mathbf{X}$ of the kind that represents the direction of the
axis $k_1, k_2, \ldots, k_n$ of $K_n$ represents a total state in which the first
particle is in the state $k_1$, the second in $k_2$, and so on.

When we apply any permutation to the upper or lower indices in
(204) we get another state that belongs to the single states $k_1, k_2, \ldots, k_n$
as well, differing from (204) only in the coordination of these
states to the particles. If in the sequence of indices $k_1, k_2, \ldots, k_n$
(which need not differ one from the other) the value 1 occurs $n_1$
times, the value 2 $n_2$ times, and so on, there are $n!/n_1!\,n_2!\cdots$ different
states $\mathbf{X}_1, \mathbf{X}_2, \ldots$ all of which belong to the same single states
$k_1, k_2, \ldots, k_n$. Thus any state of the kind $k_1, k_2, \ldots, k_n$ can be
represented in the form

$$
\mathbf{X} = a_1\mathbf{X}_1 + a_2\mathbf{X}_2 + \cdots
\tag{205}
$$

where $|a_i|^2$ represents the probability that the states $k_1, k_2, \ldots, k_n$
are coordinated to the particles in the manner required by $\mathbf{X}_i$.

Now let us assume that the particles are indistinguishable. Then
(205) has a meaning only if all the probabilities $|a_i|^2$ have the same
value. Otherwise, in contradiction to the possibilities of observation,
a certain individual coordination of the $k$ states to the particles would
be maintained. We now learn from experience that in nature the
condition that $|a_i|^2$ be a constant is realized in two different ways:

(i) For particles the spin of which is either zero or an integral multiple
of $\hbar$, all the $a_i$ terms have the same value; consequently the only
states possible are those of the sort,

$$
\mathbf{X} = a(\mathbf{X}_1 + \mathbf{X}_2 + \cdots)
\tag{206}
$$


† Only a product and not a sum can be considered, for our formalism requires
that the component of $\mathbf{X}$ in the direction of a $K_n$ axis defines by the square of its
magnitude the probability of finding the first particle in $k_1$, the second in $k_2$, and
so on. However, this probability equals the product and not the sum of the single
probabilities.


The principle of symmetric states then holds for the system, and the
statistics of Bose-Einstein must be applied.

(ii) For particles with a half-integer spin, $a_i$ will be positive or
negative depending on whether $\mathbf{X}_i$ results from
$\mathbf{X}_1 = \mathbf{x}_{k_1}^{(1)}\mathbf{x}_{k_2}^{(2)}\cdots$
with $k_1 < k_2 < k_3 < \cdots$ by an even or odd number of transpositions
applied to the $k_i$. For such particles $\mathbf{X}$ is given by

$$
\mathbf{X} = a(\mathbf{X}_1 - \mathbf{X}_2 + \mathbf{X}_3 - \cdots)
\tag{207}
$$

In this case the principle of antisymmetric states holds and the
Fermi-Dirac statistics must be applied. $\mathbf{X}$ may be represented by the
determinant

$$
\mathbf{X} = a
\begin{vmatrix}
\mathbf{x}_{k_1}^{(1)} & \mathbf{x}_{k_1}^{(2)} & \cdots & \mathbf{x}_{k_1}^{(n)}\\
\mathbf{x}_{k_2}^{(1)} & \mathbf{x}_{k_2}^{(2)} & \cdots & \mathbf{x}_{k_2}^{(n)}\\
\vdots & \vdots & & \vdots\\
\mathbf{x}_{k_n}^{(1)} & \mathbf{x}_{k_n}^{(2)} & \cdots & \mathbf{x}_{k_n}^{(n)}
\end{vmatrix}
\tag{207'}
$$

The factor $a$ in (206) and (207) is to be chosen so that $|\mathbf{X}|^2 = 1$. If
the $\mathbf{X}_i$ are already normalized, that is, if $(\mathbf{X}_i\mathbf{X}_k) = \delta_{ik}$, then, in
the case represented by (206), from $|\mathbf{X}|^2 = 1$ we obtain the equation
$|a|^2\,(n!/n_1!\,n_2!\cdots) = 1$ and therefore $a = \sqrt{n_1!\,n_2!\cdots/n!}$. On
the other hand, because of (207'), an antisymmetric $\mathbf{X}$ is possible only
if all the $k_i$ states differ, so that $\mathbf{X}$ is composed of $n!$ terms and $a$ is
given by $a = 1/\sqrt{n!}$.
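
The counting behind these normalization factors can be illustrated numerically. The sketch below is only an illustration under assumptions: a product state is encoded as a tuple whose $i$th entry is the single state of the $i$th particle, the product states are taken as orthonormal, and the function names are hypothetical.

```python
from collections import Counter
from itertools import permutations
from math import factorial, isclose, prod, sqrt


def permutation_sign(perm):
    """Sign of a permutation of 0..n-1, found by counting inversions."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


def symmetric_state(ks):
    """Normalized state of type (206): equal amplitude on every distinct product."""
    distinct = set(permutations(ks))  # the product states X_1, X_2, ... of (205)
    a = 1 / sqrt(len(distinct))
    return {product_state: a for product_state in distinct}


def antisymmetric_state(ks):
    """Normalized state of type (207); an empty dict means no such state exists."""
    amplitudes = {}
    for perm in permutations(range(len(ks))):
        product_state = tuple(ks[p] for p in perm)
        amplitudes[product_state] = (
            amplitudes.get(product_state, 0) + permutation_sign(perm)
        )
    norm = sqrt(sum(c * c for c in amplitudes.values()))
    if norm == 0:
        return {}
    return {state: c / norm for state, c in amplitudes.items()}


def expected_symmetric_factor(ks):
    """a = sqrt(n_1! n_2! ... / n!) as stated in the text."""
    counts = Counter(ks).values()
    return sqrt(prod(factorial(c) for c in counts) / factorial(len(ks)))


if __name__ == "__main__":
    print(symmetric_state((1, 1, 2)))
    print(antisymmetric_state((1, 1, 2)))  # two equal single states: vanishes
    print(antisymmetric_state((1, 2, 3)))
```

For the single states (1, 1, 2) the symmetric state contains $3!/2!\,1! = 3$ product states, whereas the antisymmetric combination cancels term by term, in agreement with the statement that all $k_i$ must differ.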

Thus we cannot characterize a state of the kind (206) or (207) by
specifying the state $k_i$ for every individual particle but can give only
the description that $n_1$ particles (we do not know which) are in the state
1, $n_2$ particles in the state 2, and so on. The $n_i$ must, of course,
satisfy the condition that $\sum_i n_i = n$ and, for an antisymmetric system,
can have only the values 1 and 0.

Once again it becomes necessary to change the coordinate system in
the Hilbert space. We have extended the original system $K_0$, which
was adapted to a single particle, to a system $K_n$, which was made
sufficient for $n$ particles by providing an axis for any combination of
states $k_1, k_2, \ldots, k_n$ coordinated to the particles $1, 2, \ldots, n$.
But, since the particles are indistinguishable, the totality of all states
resulting from $k_1, k_2, \ldots, k_n$ by applying a permutation to the $k_i$
terms now reduces to a single state, (206) or (207), and this fact forces
us to simplify $K_n$ to a system $K$ wherein the axes are characterized by
an infinite sequence of numbers $n_1, n_2, \ldots$. This has to be understood
in the sense that the axis $n_1, n_2, \ldots$ corresponds to a state in
which $n_1$ particles are in state 1, $n_2$ particles in state 2, and so on. If
we denote the respective unit vectors by $\mathbf{X}_{n_1 n_2\cdots}$, any state of the
total system can be described by coordinates $x_{n_1 n_2\cdots}$ in the form

$$
\mathbf{X} = \sum_{n_1 n_2\cdots} x_{n_1 n_2\cdots}\,\mathbf{X}_{n_1 n_2\cdots}
\tag{208}
$$

and $|x_{n_1 n_2\cdots}|^2$ gives the probability that a measurement, carried out on
all particles, finds $n_1$ of them in state 1, $n_2$ of them in state 2, and so
on. The components $x_{n_1 n_2\cdots}$ of the vector must now be marked by a
sequence of numbers $n_1, n_2, \ldots$ the whole of which now replaces
the index $i$.

The representation expressed by (208) includes all statements that
can be made in quantum mechanics about the possible states of a
system of indistinguishable particles, and the only thing we must yet
investigate is the law according to which the vector $\mathbf{X}$ changes with
time. We know this law to be

$$
\frac{\hbar}{i}\frac{d\mathbf{X}}{dt} = \mathbf{H}\mathbf{X}
\tag{209}
$$

where $\mathbf{H}$ denotes the Hamiltonian operator of the total system. $\mathbf{H}$
is a matrix the elements of which, when referred to the coordinate
system $K$, are to be written in the form $\mathbf{H}_{n_1 n_2\cdots,\,n_1' n_2'\cdots}$, so that from
(209) we obtain

$$
\frac{\hbar}{i}\frac{dx_{n_1 n_2\cdots}}{dt}
= \sum_{n_1' n_2'\cdots} \mathbf{H}_{n_1 n_2\cdots,\,n_1' n_2'\cdots}\,x_{n_1' n_2'\cdots}
\tag{209'}
$$


The question here is how to determine $\mathbf{H}$ from the operator $H$ which
holds for a single particle. This question will be answered first for the
symmetric system. In this case the probability of finding the first
particle in the state $k_1$, the second in $k_2$, and so on is given by
$|x_{k_1}^{(1)}|^2\,|x_{k_2}^{(2)}|^2\cdots$. The state considered is of the kind $n_1, n_2, \ldots$ if in
the sequence $k_1, k_2, \ldots, k_n$ the values $1, 2, \ldots$ occur $n_1, n_2, \ldots$
times respectively. As the probability of another distribution of the
same single states has the same value and the number of distributions
is $n!/n_1!\,n_2!\cdots$, we have

$$
|x_{n_1 n_2\cdots}|^2 = \frac{n!}{n_1!\,n_2!\cdots}\,
|x_{k_1}^{(1)}|^2\,|x_{k_2}^{(2)}|^2\cdots|x_{k_n}^{(n)}|^2
$$

Therefore

$$
x_{n_1 n_2\cdots} = \sqrt{\frac{n!}{n_1!\,n_2!\cdots}}\;
x_{k_1}^{(1)}\,x_{k_2}^{(2)}\cdots x_{k_n}^{(n)}
\tag{210}
$$

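Recalling that for a single particle

$$
\frac{\hbar}{i}\frac{dx_\alpha}{dt} = \sum_\beta H_{\alpha\beta}\,x_\beta
\tag{211}
$$

we obtain, upon differentiating (210) relative to time,

$$
\frac{\hbar}{i}\frac{dx_{n_1 n_2\cdots}}{dt}
= \sum_\alpha n_\alpha H_{\alpha\alpha}\,x_{n_1 n_2\cdots}
+ \sum_{\alpha\neq\beta} H_{\alpha\beta}\sqrt{n_\alpha(n_\beta+1)}\;
x_{n_1 n_2\cdots n_\alpha-1\cdots n_\beta+1\cdots}
\tag{212}
$$
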
Equation (212) is arrived at in the following way: First we take those
factors in the product (210) for which $k_i$ has an arbitrarily given value
$\alpha$. According to the definition of the numbers $n_1, n_2, \ldots$, the
number of these factors is $n_\alpha$. When we differentiate the factors relative
to time, we obtain

(i) because of the terms $H_{\alpha\alpha}x_\alpha$ in (211), the product (210) multiplied
$n_\alpha$ times by $H_{\alpha\alpha}$: $H_{\alpha\alpha}\,n_\alpha\,x_{n_1 n_2\cdots}$. This explains the first term on
the right-hand side of (212).

(ii) because of the term $H_{\alpha\beta}x_\beta$, a product in which the factor $x_\alpha$ no
longer occurs $n_\alpha$ times but $n_\alpha - 1$ times, whereas the number of $x_{k_i}$
terms, with $k_i = \beta$, is increased from $n_\beta$ to $n_\beta + 1$, that is, a product
which, according to (210), belongs to $x_{n_1 n_2\cdots n_\alpha-1\cdots n_\beta+1\cdots}$. As the
factor of the product we find $H_{\alpha\beta}\,n_\alpha\sqrt{n!/n_1!\,n_2!\cdots}$. If we take
$n_\alpha$ into the root, one factor $n_\alpha$ is cancelled by the last factor of $n_\alpha!$ in
the denominator, thus reducing it to $(n_\alpha - 1)!$. If, in addition, we
multiply numerator and denominator by $n_\beta + 1$, the factor $n_\beta!$ of
the denominator is increased to $(n_\beta + 1)!$ and we obtain

$$
\sqrt{\frac{n!\,n_\alpha(n_\beta+1)}{n_1!\,n_2!\cdots(n_\alpha-1)!\cdots(n_\beta+1)!\cdots}}
$$

as a factor. But this is $\sqrt{n_\alpha(n_\beta+1)}$ times the factor which, according
to (210), belongs to $x_{n_1 n_2\cdots n_\alpha-1\cdots n_\beta+1\cdots}$. Thus, because of $H_{\alpha\beta}$,
we obtain the term $H_{\alpha\beta}\sqrt{n_\alpha(n_\beta+1)}\;x_{n_1 n_2\cdots n_\alpha-1\cdots n_\beta+1\cdots}$.
On comparing (212) with (209') we now see that the diagonal elements
of $\mathbf{H}$ are given by

$$
\mathbf{H}_{n_1 n_2\cdots,\,n_1 n_2\cdots} = \sum_\alpha n_\alpha H_{\alpha\alpha}
\tag{213}
$$

For any $n_\alpha$ term in this sum, that value is to be taken which occurs in
the diagonal element $\mathbf{H}_{n_1 n_2\cdots,\,n_1 n_2\cdots}$ as index. On the other hand,
the non-diagonal elements are

$$
\mathbf{H}_{n_1 n_2\cdots,\,n_1' n_2'\cdots} = H_{\alpha\beta}\sqrt{n_\alpha(n_\beta+1)}
\quad\text{or}\quad 0
\tag{214}
$$

the first value holding for $n_\alpha' = n_\alpha - 1$, $n_\beta' = n_\beta + 1$, with all the
other $n_i'$ terms being equal to the $n_i$. In all other cases the second
value holds.
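
To make (213) and (214) concrete, the total matrix can be assembled for a small case. This is a minimal sketch, assuming a finite number of single states, a real symmetric single-particle matrix given as a list of lists, and zero-based state labels; the function names are illustrative only.

```python
from itertools import product
from math import isclose, sqrt


def occupations(n, m):
    """All occupation sequences (n_1, ..., n_m) with n_1 + ... + n_m = n."""
    return [occ for occ in product(range(n + 1), repeat=m) if sum(occ) == n]


def many_body_element(H, occ, occ_prime):
    """Element of the total Hamiltonian between two occupation sequences.

    H is the single-particle energy matrix; the value follows (213) when
    occ == occ_prime and (214) otherwise.
    """
    m = len(occ)
    if occ == occ_prime:
        return sum(occ[a] * H[a][a] for a in range(m))
    for a in range(m):
        for b in range(m):
            if a == b:
                continue
            shifted = list(occ)
            shifted[a] -= 1  # n_a' = n_a - 1
            shifted[b] += 1  # n_b' = n_b + 1
            if tuple(shifted) == occ_prime:
                return H[a][b] * sqrt(occ[a] * (occ[b] + 1))
    return 0.0


def many_body_matrix(H, n):
    """Return the list of occupation sequences and the matrix referred to them."""
    basis = occupations(n, len(H))
    return basis, [[many_body_element(H, r, c) for c in basis] for r in basis]


if __name__ == "__main__":
    H = [[1.0, 0.5], [0.5, 2.0]]  # hypothetical two-level energy matrix
    basis, M = many_body_matrix(H, 2)
    for occ, row in zip(basis, M):
        print(occ, row)
```

For a single particle ($n = 1$) the construction reduces to the matrix $H$ itself, and for a Hermitean $H$ the resulting total matrix is again Hermitean, since exchanging the roles of the two occupation sequences exchanges $\alpha$ and $\beta$ while leaving the square root unchanged.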

In a quite similar way $\mathbf{H}$ can be evaluated for an antisymmetric
system. Here we shall be satisfied with the result only. For the
diagonal elements of $\mathbf{H}$, (213) is again valid but the non-diagonal elements
are given by

$$
\mathbf{H}_{n_1 n_2\cdots,\,n_1' n_2'\cdots} = \pm H_{\alpha\beta} \quad\text{or}\quad 0
$$

The value $\pm H_{\alpha\beta}$ holds for $n_\alpha = 1$, $n_\alpha' = 0$, $n_\beta = 0$, $n_\beta' = 1$, all the other
$n_i'$ being equal to the $n_i$. The positive or negative value is taken
depending on whether those $n_i$ terms between $n_\alpha$ and $n_\beta$ have the value
unity an even or an odd number of times.

There remains to be considered the case wherein a continuous spectrum
of eigenvalues belongs to the simultaneously measurable coordinates
$q_1, q_2, \ldots, q_s$ by which the state of a single particle is defined.
Then the state $\mathbf{x}$ of the particle can be described by a function $\psi(q_1\cdots q_s)$,
which by $|\psi|^2\,dq_1\cdots dq_s$ gives the probability of finding for
the $q_i$, when they are measured simultaneously, values between $q_i$
and $q_i + dq_i$. In what follows we shall use $\psi(q)$ to represent $\psi(q_1\cdots q_s)$,
wherein $q$ is the single variable that includes all the $q_i$ terms.
Now let us assume that there are $n$ particles the states of which are
described by $\psi_1(q), \psi_2(q), \ldots$, the $\psi_i$ being either identical or different
functions. If we denote by $q^{(i)}$ the variable $q$ which belongs to the
$i$th particle, the probability that a measurement for the first particle
gives a value $q^{(1)}$, for the second particle a value $q^{(2)}$, and so on, is
given by the square of the magnitude of the function

$$
\Psi(q^{(1)}q^{(2)}\cdots q^{(n)}) = \psi_1(q^{(1)})\,\psi_2(q^{(2)})\cdots\psi_n(q^{(n)})
$$

When in this function the indices of two $\psi_i$ functions, say 1 and 2,
are interchanged, we again obtain a function of the $q^{(i)}$ which, however,
gives a different probability for the result $q^{(1)}q^{(2)}\cdots q^{(n)}$.
This state differs from the preceding one in that the particles 1 and 2
have interchanged their states $\psi_1$ and $\psi_2$. However, since the particles
are indistinguishable, we can never judge which of the particles belongs
to which $\psi_i$, so that only those $\Psi(q^{(1)}q^{(2)}\cdots q^{(n)})$ have meaning which
are symmetric or antisymmetric in the $q^{(i)}$ terms. This requirement is
fulfilled by

$$
\Psi = \sum_P \psi_1(q^{(1)})\,\psi_2(q^{(2)})\cdots\psi_n(q^{(n)})
$$

the sum being understood in the sense that any permutation is applied
to the indices of the $\psi_i$ and that the terms are taken either all with the
same sign or positive and negative alternately.

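In order to calculate $\mathbf{H}$, the operator that regulates the time change
of $\Psi$ by $(\hbar/i)(d\Psi/dt) = \mathbf{H}\Psi$, we start with the $\psi_i$ functions. If $H_i$ is
the energy operator for the $\psi_i$ state, we have

$$
\frac{\hbar}{i}\frac{d}{dt}\psi_i(q^{(i)}) = H_i\,\psi_i(q^{(i)})
$$

Thus it follows that

$$
\frac{d\Psi}{dt} = \sum_P \frac{i}{\hbar}\,H_1\psi_1(q^{(1)})\,\psi_2(q^{(2)})\cdots\psi_n(q^{(n)}) + \cdots
$$

After the permutations have been applied, the first term gives $(i/\hbar)H_1\Psi$,
and so we obtain

$$
\frac{\hbar}{i}\frac{d\Psi}{dt} = \sum_i H_i\,\Psi
\quad\text{or}\quad
\mathbf{H} = \sum_i H_i
\tag{215}
$$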


43. Systems of Many Similar Particles. Method of Wave 
Picture. In the preceding section on the treatment of the many- 
body problem a method was applied in which we used the concepts of 
the particle picture only, explaining the state of the total system from 
the states of its individual particles. According to quantum mechan- 
ics, however, the behavior of a particle can also be interpreted by the 
picture of a wave motion, and this leads to another (mathematically 
equivalent) solution of the many-body problem, a method consisting 
essentially of a quantization of the wave motion that corresponds to a 
single particle. 

To develop the method we do not consider a whole system, but
rather a single particle, again assuming a discrete eigenspectrum of
the observables $q$ which, we imagine, characterize the state of the
particle. In the Hilbert space, the coordinate system denoted by
$K_0$ in Section 42 is then spanned by the unit vectors $\mathbf{x}_k$ and any
state $\mathbf{x}$ can be represented in the form $\mathbf{x} = \sum_k x_k\mathbf{x}_k$. If we resolve $\mathbf{x}$
and $\mathbf{x}_k$ relative to the axes of the coordinate system belonging to the
coordinates $xyz$ of the particle, $\mathbf{x}$ and $\mathbf{x}_k$ become associated with the
functions $\psi(xyzt)$ and $\psi_k(xyz)$, the relation being given by

$$
\psi(xyzt) = \sum_k x_k(t)\,\psi_k(xyz)
\tag{216}
$$

The coordinates $x_k$ are written as functions of $t$ because they change
with time according to

$$
\frac{\hbar}{i}\frac{dx_i}{dt} = \sum_k H_{ik}\,x_k
\tag{217}
$$

where $\|H_{ik}\|$ is the energy matrix of a single particle referred to $K_0$.

If the function $\psi(xyzt)$ is interpreted as an excitation of some kind or
other, $\psi$ represents a wave motion the wave surfaces of which at time
$t_0$ are given by $\psi(xyzt_0) = \text{constant}$. Thus the meaning of equations
(216) and (217) is that the wave can be resolved into parts $\psi_k(xyz)$
which appear in the total wave with intensities given by the $x_k(t)$
factors. We shall show now that the quantization of the wave
motion leads to a formalism that is closely related to the solution of
the many-body problem.

If the $\psi_k(xyz)$ functions are given, the wave motion evidently can
be described by the time-dependence of the $x_k$ terms; hence these can
be considered generalized coordinates of the system. If we wish to
apply the methods of quantum mechanics to the system, it will be
necessary to give equations (217) the form of canonical equations by
deriving them from the Hamiltonian $\mathbf{H}$. If $\mathbf{H}$ is known as a function
of the $x_k$ terms and the conjugate momenta $p_k$, we can carry out the
quantization by translating $x_k$ and $p_k$ into matrices chosen in such a
way that they satisfy the commutation relation.

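We begin by determining the function $\mathbf{H}$. The condition $\mathbf{H}$ must
satisfy is that equation (217) be identical with

$$
\frac{dx_k}{dt} = \frac{\partial\mathbf{H}}{\partial p_k}
\qquad
\frac{dp_k}{dt} = -\frac{\partial\mathbf{H}}{\partial x_k}
\tag{218}
$$

We can achieve this by choosing $\mathbf{H}$ to be

$$
\mathbf{H} = \frac{i}{\hbar}\sum_{ik} H_{ik}\,x_k\,p_i
\tag{219}
$$

For from (218) we get

$$
\frac{dx_i}{dt} = \frac{i}{\hbar}\sum_k H_{ik}\,x_k
\qquad
\frac{dp_i}{dt} = -\frac{i}{\hbar}\sum_k H_{ki}\,p_k = -\frac{i}{\hbar}\sum_k H_{ik}^{*}\,p_k
$$

The second equation becomes the conjugate complex of (217) if we put
$p_k = (\hbar/i)x_k^{*}$. Thus we obtain

$$
\mathbf{H} = \sum_{ik} H_{ik}\,x_i^{*}x_k
\tag{220}
$$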


The next step in the procedure is to translate the $x$ coordinates and the
momenta $p$ into matrices. This means that we must subject the
already quantized equations (216) and (217) to a second quantization,
for which we need a new Hilbert space equipped with a coordinate
system $K$, this system having no relation to $K_0$ of the preceding discussions.
If we denote the matrix belonging to $x_k$ by $X_k$, the matrix
corresponding to $x_k^{*}$ is the Hermitean conjugate matrix $X_k^\dagger$; for $x_k + x_k^{*}$
is real and thus the corresponding matrix must be Hermitean, this
condition being satisfied only by $X_k + X_k^\dagger$. The coordinate and
momentum, therefore, are to be represented by $X_k$ and $(\hbar/i)X_k^\dagger$, which
are required to fulfill the relations

$$
\begin{aligned}
X_iX_k^\dagger - X_k^\dagger X_i &= \delta_{ik}\\
X_iX_k - X_kX_i &= 0\\
X_i^\dagger X_k^\dagger - X_k^\dagger X_i^\dagger &= 0
\end{aligned}
\tag{220}
$$

We obtain for $\mathbf{H}$, using the matrices $X_k$ and $X_k^\dagger$,

$$
\mathbf{H} = \sum_{ik} H_{ik}\,X_i^\dagger X_k
\tag{221}
$$

The coordinate system $K$ of the Hilbert space to which the matrices
$X_i$ and $X_i^\dagger$ are referred can be chosen arbitrarily, and we choose it in
such a way that $X_i^\dagger X_i$ becomes diagonal. Then a certain eigenvalue of
$X_i^\dagger X_i$ can be correlated to any axis of $K$ for any $i$ value. If we imagine
that the eigenvalues of $X_i^\dagger X_i$ are marked by $1, 2, \ldots$, any axis of
$K$ can be designated by $n_1, n_2, \ldots$, meaning that the axis belongs to
the $n_1$th eigenvalue of $X_1^\dagger X_1$, to the $n_2$th eigenvalue of $X_2^\dagger X_2$, and so on.
It is easily seen that the solutions of (220) satisfying the condition
that $X_i^\dagger X_i$ be diagonal are given by†

$$
(X_i)_{n_i,\,n_i+1} = \sqrt{n_i+1}
\qquad
(X_i^\dagger)_{n_i,\,n_i-1} = \sqrt{n_i}
\tag{222}
$$


with all the other elements being zero. 
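
The properties claimed for (222) can be checked on truncated matrices for a single partial wave. The following sketch rests on assumptions not made in the text: the index $n$ runs from 0 to a finite cutoff, so that the commutation relation necessarily fails in the last diagonal element, and the helper names are illustrative.

```python
from math import isclose, sqrt


def ladder_matrices(dim):
    """Truncated matrices X and its Hermitean conjugate Xd for one partial wave.

    Rows and columns are labeled by n = 0, 1, ..., dim - 1. Following (222),
    X[n][n + 1] = sqrt(n + 1) and Xd[n][n - 1] = sqrt(n); all else is zero.
    """
    X = [[0.0] * dim for _ in range(dim)]
    Xd = [[0.0] * dim for _ in range(dim)]
    for n in range(dim - 1):
        X[n][n + 1] = sqrt(n + 1)
        Xd[n + 1][n] = sqrt(n + 1)
    return X, Xd


def matmul(A, B):
    """Plain matrix product of two square lists of lists."""
    size = len(A)
    return [
        [sum(A[r][k] * B[k][c] for k in range(size)) for c in range(size)]
        for r in range(size)
    ]


def commutator(A, B):
    """AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    size = len(A)
    return [[AB[r][c] - BA[r][c] for c in range(size)] for r in range(size)]


if __name__ == "__main__":
    X, Xd = ladder_matrices(5)
    number = matmul(Xd, X)
    print([number[n][n] for n in range(5)])   # diagonal of Xd X
    comm = commutator(X, Xd)
    print([comm[n][n] for n in range(5)])     # diagonal of X Xd - Xd X
```

The product of the conjugate matrix with $X$ comes out diagonal with the whole numbers $n$ as elements, and the commutator equals unity everywhere except in the final row, where the cutoff removes the element that would otherwise complete it.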


† If a matrix $A$ is referred to a coordinate system $K$ the axes of which are marked
by several indices, the elements of $A$ must be written in the form $a_{n_1 n_2\cdots,\,n_1' n_2'\cdots}$.
The multiplication rule then requires that the elements of the product $AB$ be
defined by

$$
(AB)_{n_1 n_2\cdots,\,n_1' n_2'\cdots}
= \sum a_{n_1 n_2\cdots,\,n_1'' n_2''\cdots}\;b_{n_1'' n_2''\cdots,\,n_1' n_2'\cdots}
$$

this sum to be extended over all combinations $n_1'' n_2''\cdots$. For example, in (223)
all $(X_i^\dagger)_{n_1 n_2\cdots,\,n_1'' n_2''\cdots} = 0$ except
$(X_i^\dagger)_{n_1 n_2\cdots n_i\cdots,\,n_1 n_2\cdots(n_i-1)\cdots} = \sqrt{n_i}$, and then
all $(X_i)_{n_1'' n_2''\cdots,\,n_1' n_2'\cdots}$ are zero except
$(X_i)_{n_1 n_2\cdots(n_i-1)\cdots,\,n_1 n_2\cdots n_i\cdots} = \sqrt{n_i}$.


For the sake of simplicity the elements of $X_i$ are marked only by
those $n_i$, $n_i'$ indices which are different, so that $(X_i)_{n_i,\,n_i+1}$ is an abbreviation
of $(X_i)_{n_1 n_2\cdots n_i\cdots,\,n_1 n_2\cdots n_i+1\cdots}$. From (222) we get for the product
$X_i^\dagger X_i$ the diagonal elements

$$
(X_i^\dagger X_i)_{n_i,\,n_i} = n_i
\tag{223}
$$

whereas all non-diagonal elements are zero. In numbering the axes of
$K$ we may, therefore, use the eigenvalues of $X_i^\dagger X_i$ directly.

Let us now denote by $\mathbf{X}$ the vector that represents the state of the
system, and let its components relative to the $n_1, n_2, \ldots$ axes of
$K$ be $x_{n_1 n_2\cdots}$. If $\mathbf{X}$ has the direction of one of these axes, the product
$x_i^{*}x_i = |x_i|^2$ is given by that eigenvalue of $X_i^\dagger X_i$ which belongs to the
axis, that is, by a whole number $n_i$. $|x_i|^2$ is the square of the amplitude
of the $i$th partial wave and thus measures its intensity, so that
$\mathbf{X}$ indicates a wave motion in which the first, second, and so on partial
wave appear in strengths $n_1, n_2, \ldots$. If $\mathbf{X}$ does not coincide with
an axis of $K$, the component $x_{n_1 n_2\cdots}$, by its square $|x_{n_1 n_2\cdots}|^2$, determines
the probability that the measurement of the partial waves finds
the strengths $n_1, n_2, \ldots$ respectively.


For the elements of the Hamiltonian $\mathbf{H} = \sum_{ik} H_{ik}X_i^\dagger X_k$ we obtain
now

$$
\begin{aligned}
\mathbf{H}_{n_1 n_2\cdots,\,n_1 n_2\cdots} &= \sum_\alpha n_\alpha H_{\alpha\alpha}\\
\mathbf{H}_{n_1 n_2\cdots,\,n_1' n_2'\cdots} &= H_{\alpha\beta}\sqrt{n_\alpha(n_\beta+1)} \quad\text{or}\quad 0
\end{aligned}
\tag{224}
$$

For the non-diagonal elements the first value holds for $n_\alpha' = n_\alpha - 1$,
$n_\beta' = n_\beta + 1$, all the other $n_i'$ terms being equal to the $n_i$. The
second value holds in all other cases. Thus we arrive at the important
result: The operator $\mathbf{H}$, which, according to $(\hbar/i)(d\mathbf{X}/dt) = \mathbf{H}\mathbf{X}$,
regulates the time rate of change of the wave motion, turns out to be
identical with the operator that belongs to a symmetric system of
particles according to (213) and (214). Thus in a
system of particles which is capable only of symmetric states, the
quantization of the wave motion, carried out with the aid of the commutation
relation (220), leads to the same formalism as the quantization
of the particle picture. Although quite different in content, the
two methods are mathematically equivalent, so that the many-body
problem may be treated just as well by the wave picture as by the
particle picture. In order to return to the particle picture, we need
only apply a re-interpretation by considering the square of the amplitude
$|x_i|^2$ of the $i$th partial wave as the number $n_i$ of the particles
which are in the $i$th state.

The wave picture method may also be applied to systems observed
only in antisymmetric states, the difference merely being that the
relations

$$
X_iX_k^\dagger + X_k^\dagger X_i = \delta_{ik}
\qquad
X_iX_k + X_kX_i = 0
\qquad
X_i^\dagger X_k^\dagger + X_k^\dagger X_i^\dagger = 0
$$

must be substituted for (220). This case will not be considered in
detail here.
It should be noticed that a Hamiltonian operator of the form
$\sum_{ik} H_{ik}X_i^\dagger X_k$ permits the system to be changed in time only in such a
way that the number of particles remains constant. For it follows
from $(\hbar/i)(d\mathbf{X}/dt) = \mathbf{H}\mathbf{X}$ that, if we resolve the vector $\mathbf{X}$ relative to
the axes of $K$,

$$
\frac{\hbar}{i}\frac{dx_{n_1 n_2\cdots}}{dt}
= \sum_{i\,k\,n_1' n_2'\cdots} H_{ik}\,(X_i^\dagger X_k)_{n_1 n_2\cdots,\,n_1' n_2'\cdots}\;x_{n_1' n_2'\cdots}
\tag{225}
$$

$|x_{n_1 n_2\cdots}|^2$ is the probability of finding $n_1$ particles in state 1, $n_2$ in
state 2, and so on. Now let us assume that at $t_0$ the system is in the
state $n_1^0, n_2^0, \ldots$. Then, on the right-hand side of (225), only
terms with $n_1' = n_1^0$, $n_2' = n_2^0$, and so on will occur, since only the
$x_{n_1^0 n_2^0\cdots}$ term is not equal to zero. However, according to (222),
$(X_i^\dagger X_k)_{n_1 n_2\cdots,\,n_1' n_2'\cdots}$ differs from zero only for $n_i = n_i^0 + 1$,
$n_k = n_k^0 - 1$, all other $n_r$ terms equaling the $n_r^0$. This means that, because
of $H_{ik}(X_i^\dagger X_k)$, only such components can be had wherein
$n_1 + n_2 + \cdots = n_1^0 + n_2^0 + \cdots$, the only effect of $(X_i^\dagger X_k)$ being that the
occupation number of the $i$th state is increased by 1, whereas that of
the $k$th state is diminished by 1.
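
The conservation of the particle number can be seen directly from the action of one such product on an axis of $K$. The sketch below is an illustration under assumptions: an axis is encoded as a tuple of occupation numbers, state indices are zero-based, the amplitudes are those implied by (222), and the function name is hypothetical.

```python
from math import sqrt


def apply_transfer(occ, i, k):
    """Apply the product (conjugate of X_i) times X_k to the axis labeled by occ.

    Returns (amplitude, new_occ); an amplitude of 0.0 means the term vanishes.
    The indices i and k are zero-based positions in the tuple occ.
    """
    if occ[k] == 0:
        return 0.0, occ                 # nothing to remove from state k
    if i == k:
        return float(occ[k]), occ       # diagonal element, see (223)
    new_occ = list(occ)
    new_occ[k] -= 1                     # state k loses one particle
    new_occ[i] += 1                     # state i gains one particle
    amplitude = sqrt(occ[k]) * sqrt(new_occ[i])
    return amplitude, tuple(new_occ)


if __name__ == "__main__":
    start = (2, 0, 1)
    for i in range(3):
        for k in range(3):
            amplitude, end = apply_transfer(start, i, k)
            print(i, k, amplitude, end, sum(end) == sum(start))
```

Every nonvanishing term leaves the sum of the occupation numbers unchanged, which is the content of the statement above; a term containing only one of the two matrices would change that sum by one.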

Later on we shall come across systems the Hamiltonian of which
contains terms with $X_i$ only or $X_i^\dagger$ only. Such terms signify the possibility
that particles are created or annihilated.

The equivalence of a quantized wave motion with a many-particle 
system is of decisive importance for the theory of wave fields. Our 
considerations have been concerned with a special wave field cor- 
responding to equations (216) and (217). However, the method can 
be generalized on a large scale. If an arbitrary wave field is given
whose excitation, in so far as it depends on $x, y, z, t$, is defined by
$\psi(xyzt)$, then $\psi$, with the help of a system of orthogonal functions
$f_i(xyz)$, can always be expanded into a series, $\sum_i x_i(t)f_i(xyz)$, and in the
coefficients $x_i$ of the expansion we obtain generalized coordinates the
values of which define the state of the system for any time $t$. We can
correlate certain momenta $p_i$ to the $x_i$ terms. If the energy of the
field is expressed in terms of $x_i$ and $p_i$, we obtain the Hamiltonian $\mathbf{H}$
from which the field equations can be derived in the form

$$
\frac{dx_i}{dt} = \frac{\partial\mathbf{H}}{\partial p_i}
\qquad
\frac{dp_i}{dt} = -\frac{\partial\mathbf{H}}{\partial x_i}
$$

If we succeed in selecting $x_i$ in such a way that $\mathbf{H}$ takes on the form
$\sum_{ik} H_{ik}\,x_i^{*}x_k$, then, according to the theorem proved, the quantized
wave field is equivalent to a system of particles; this means that the
measurement of all observables that can be ascertained simultaneously
with the $|x_i|^2$ terms leads to the same result as if the wave field were
a system of particles. Later on we shall adopt this method in the
treatment of all wave fields, especially those corresponding to the
light quantum and the meson.

44. Statistics of Bose-Einstein and Fermi-Dirac. According
to quantum mechanics only symmetric or antisymmetric states for a
gas are allowable. Any of these states $\mathbf{X}$ is characterized by the
numbers $n_1, n_2, \ldots$ of the particles that are in the single states
$1, 2, \ldots$. Because the particles are indistinguishable, the question
as to which of them belong to what states is meaningless. This
situation necessitates a drastic departure from ordinary ideas in judg- 
ing the a priori probability of a state. Classical physics would have 
estimated the probability from the number of different possibilities of 
coordinating to the particles the single states of which the total state 
is composed. In contradiction to this method, quantum mechanics 
gives to any symmetric or antisymmetric state, represented by a 
vector X in the Hilbert space, the same weight. As far as quantum 
mechanics is concerned, it is irrelevant, for example, that a total state 
composed of different single states can be realized in n! different ways, 
whereas in the case of n identical states there is only one method of 
realization. This is no reason for quantum mechanics giving the first 
state a greater weight than the second, because the two states repre- 
sented by a unit vector in the isotropic Hilbert space are considered to 
be perfectly equivalent. 

To clarify this important difference between classical and quantum
mechanics, we consider the simple case where two particles are present,
the possible states of which can be given by a set of $\mathbf{x}_k$ vectors. Classical
statistics would look at this as follows: When we investigate the
particles at time $t$, we find either two identical or two different $\mathbf{x}_k$
states. A total state of the first kind, $\mathbf{X} = \mathbf{x}_i^{(1)}\mathbf{x}_i^{(2)}$, can be realized
in only one way, since an interchange of the two particles leaves $\mathbf{X}$
unchanged. For the states $\mathbf{X} = \mathbf{x}_i^{(1)}\mathbf{x}_i^{(2)}$, the weight is therefore
1. On the other hand, there are two different total states,
$\mathbf{X} = \mathbf{x}_i^{(1)}\mathbf{x}_k^{(2)}$ and $\mathbf{X} = \mathbf{x}_k^{(1)}\mathbf{x}_i^{(2)}$, for which the single states $i$ and $k$ are
different; hence the weight of these states is 2. In other words,
according to classical statistics, the probability of finding the particles
should be twice as great in different states as in identical states.
Quantum mechanics looks at it differently. If only symmetric states
for the gas are permissible (which may be assumed), a total state with
two different $\mathbf{x}_k$ cannot be realized in two different ways, but only by
$\mathbf{X} = \mathbf{x}_i^{(1)}\mathbf{x}_k^{(2)} + \mathbf{x}_k^{(1)}\mathbf{x}_i^{(2)}$, so that a total state of this kind has the
same weight as a state $\mathbf{x}_i^{(1)}\mathbf{x}_i^{(2)}$. Thus the a priori probability is the
same whether the two eigenstates composing the total state are different
or not.
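
The difference between the two ways of counting shows up already in the chance of finding both particles in the same single state. The sketch below assumes, purely for illustration, a finite number $m$ of single states, all total states being given equal a priori weight in each scheme; the function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations_with_replacement, product


def classical_same_state_probability(m):
    """Distinguishable particles: every ordered pair (i, k) has equal weight."""
    pairs = list(product(range(m), repeat=2))
    same = sum(1 for i, k in pairs if i == k)
    return Fraction(same, len(pairs))


def bose_same_state_probability(m):
    """Indistinguishable particles: every unordered pair {i, k} has equal weight."""
    pairs = list(combinations_with_replacement(range(m), 2))
    same = sum(1 for i, k in pairs if i == k)
    return Fraction(same, len(pairs))


if __name__ == "__main__":
    for m in (2, 3, 10):
        print(m, classical_same_state_probability(m), bose_same_state_probability(m))
```

With two single states the classical count gives 1/2 for finding the particles together, whereas the symmetric count gives 2/3, since the state with two different single states is counted once rather than twice.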

We can now generalize these considerations by investigating a
system made up of $n$ particles which may be assumed to be capable of
symmetric states only. Then all the states described by the symmetric
functions $\varphi'$ have the same a priori probability. In any of the states
$n$ eigenfunctions $\mathbf{x}_{k_1}, \mathbf{x}_{k_2}, \ldots$ are realized all of which need not differ
one from the other but may agree in groups. Then a certain state can
be characterized in the simplest way by arraying the single states
$\mathbf{x}_k$ in a row and setting the number $n_k$ of the particles occupying $\mathbf{x}_k$
below any $\mathbf{x}_k$. Of course, the sum of all the $n_k$ terms must equal $n$.
Then the scheme is, for example,

$$
\begin{array}{ccccccc}
\mathbf{x}_1 & \mathbf{x}_2 & \mathbf{x}_3 & \mathbf{x}_4 & \mathbf{x}_5 & \mathbf{x}_6 & \cdots\\
0 & 3 & 7 & 0 & 2 & 1 & \cdots
\end{array}
$$

It means that the first and fourth eigenstates are unoccupied, whereas 
there are 3, 7, 2, and 1 particles in the second, third, fifth, and sixth 
states respectively. These data are sufficient for determining a cer- 
tain X uniquely. When we apply a permutation to the numbers of 
the second row or exchange them for other numbers of the same sum, 
we obtain the description of another state, and now we may formulate 
the principle of quantum-mechanical statistics in the following way: 
For a given $n$, any occupation of the $\mathbf{x}_k$ states has, a priori, the same
probability. This principle was enunciated in 1924 by S. N. Bose, 
who saw no other way to derive Planck’s law for black-body radiation 
when the cavity radiation is considered a gas of light quanta. 

Thus the new idea of Bose was to characterize a state $\mathbf{X}$ of the system
only by the numbers $n_1, n_2, \ldots$ of the particles in the single states
$\mathbf{x}_1, \mathbf{x}_2, \ldots$ and to ascribe the same a priori probability to any sequence
$n_1, n_2, \ldots$, no matter by how many different individual coordinations
of particles and states the same sequence $n_1, n_2, \ldots$ can be realized.
In classical statistics the weight of a state $n_1, n_2, \ldots$ was judged
solely by the number of these individual coordinations. The state
of the kind $n_1, n_2, \ldots$ was determined by numbering the particles
from 1 up to $n$, then arraying the numbers in a row and specifying,
with the help of the numbers in a second row, the states in which the
particles were found. In this way was obtained a certain individual
occupation of the single states which is of the kind $n_1, n_2, \ldots$ if in
the second row the number 1 occurred $n_1$ times, the number 2 $n_2$ times,
and so on. By applying all possible permutations to the numbers of
the second row, all individual coordinations which correspond to the
same sequence $n_1, n_2, \ldots$ were obtained. The number of different
permutations is $n!/n_1!\,n_2!\cdots$, and this number was considered the
probability $W$ with which nature realizes a state of the kind $n_1, n_2, \ldots$.
For a given $n$, $W$ could have any value between 1 (valid for
the case where all $n_i$ vanish except one, which equals $n$) and $n!$, if
no $n_i$ is greater than 1. According to Bose, any occupation
$n_1, n_2, \ldots$ has the same probability.
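
The classical weight $W$ is easily evaluated for a given occupation. The following is a small sketch, with illustrative function names, that computes $n!/n_1!\,n_2!\cdots$ and, for small cases, confirms it by enumerating every coordination of numbered particles to states.

```python
from itertools import product
from math import factorial, prod


def classical_weight(occupation):
    """W = n! / (n_1! n_2! ...), the number of individual coordinations."""
    n = sum(occupation)
    return factorial(n) // prod(factorial(n_i) for n_i in occupation)


def brute_force_weight(occupation):
    """Count assignments of numbered particles to states that give `occupation`.

    Feasible only for small n, since all m**n assignments are enumerated.
    """
    n, m = sum(occupation), len(occupation)
    target = tuple(occupation)
    return sum(
        1
        for assignment in product(range(m), repeat=n)
        if tuple(assignment.count(state) for state in range(m)) == target
    )


if __name__ == "__main__":
    print(classical_weight((0, 3, 7, 0, 2, 1)))   # occupation used as example above
    print(classical_weight((4, 0, 0)))            # all particles in one state
    print(classical_weight((1, 1, 1, 1)))         # no state occupied twice
    print(brute_force_weight((2, 1, 0)), classical_weight((2, 1, 0)))
```

In Bose's counting each of these occupations receives the weight 1, whatever the value of $W$.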

Instead of the occupation of the eigenstates, let us consider now
that of the energy levels $E_i$, meaning by $E_i$ the energies of a single
particle. In the applications of the theory, we are always concerned
with systems that are degenerate, and consequently we shall assume
$r_1$ different eigenvectors, $\chi_1', \chi_1'', \chi_1''', \ldots, \chi_1^{(r_1)}$, all belonging to
$E_1$, $r_2$ vectors, $\chi_2', \chi_2'', \chi_2''', \ldots$, belonging to $E_2$, and so on. Then a
certain eigenstate $X$ of the gas can be described by the scheme

$$
\begin{array}{c|c|c|c}
E_1 & E_2 & \cdots & E_i \\
\chi_1'\,\chi_1''\cdots\chi_1^{(r_1)} & \chi_2'\,\chi_2''\cdots\chi_2^{(r_2)} & \cdots & \chi_i'\,\chi_i''\cdots\chi_i^{(r_i)} \\
n_1'\,n_1''\cdots n_1^{(r_1)} & n_2'\,n_2''\cdots n_2^{(r_2)} & \cdots & n_i'\,n_i''\cdots n_i^{(r_i)} \\
n_1 & n_2 & \cdots & n_i
\end{array}
\tag{226}
$$

where $n_i$ represents the sums $n_i = n_i' + n_i'' + \cdots + n_i^{(r_i)}$. On
examining the first and last rows we see from the scheme that, in
the $X$ state considered, $n_1$ particles have the energy $E_1$, $n_2$ the energy
$E_2$, and so forth. The state, however, is not yet defined by these
numbers. This can be done only by means of the detailed information
given by the other two rows in which the $n_i$ are assigned to the different
eigenstates of the same energy $E_i$. Thus for a given $n_1, n_2, \ldots$,
that is, for a given occupation of the energy levels, there exist as many
different states $X = \sigma'$ of a gas as there are different ways of allotting
the $n_i$ terms, so that, since all the $X$ states have the same a priori
probability, the number $Z$ of allotments must be considered the
probability of the occupation $n_1, n_2, \ldots$. Thus, if, on the basis of Bose’s
statistics, we wish to find the most probable energy distribution of a
gas, the first thing we must do is to evaluate the number $Z$. This can
be done as follows: When $n_i$ is resolved into $n_i', n_i'', \ldots, n_i^{(r_i)}$, we
say that the resolution is of the kind $N_0, N_1, \ldots, N_{n_i}$ if among the
$n_i', n_i'', \ldots$ terms the number 0 occurs $N_0$ times, the number 1 $N_1$
times, until finally the number $n_i$ occurs $N_{n_i}$ times. It should be pointed
out that there is no $n_i^{(k)} > n_i$, and also that $N_{n_i}$ can only be 0 or 1. Now
the question arises, in how many ways can $n_i$ be resolved in a given
manner, $N_0, N_1, \ldots, N_{n_i}$? It can be seen easily that there are
$r_i!/N_0!\,N_1!\cdots N_{n_i}!$ ways, because, if $n_i', n_i'', \ldots, n_i^{(r_i)}$ is of the
kind $N_0, N_1, \ldots, N_{n_i}$, the manner of resolution does not change
when in (226) all possible permutations are applied to the
$n_i', n_i'', \ldots, n_i^{(r_i)}$ terms. But an exchange of those particular $n_i^{(k)}$ the value
of which is 0 (there are $N_0$ of such) is without effect, and the same holds
for those $n_i^{(k)}$ which equal 1, 2, $\ldots$. Thus $r_i!$ is to be divided by
$N_0!\,N_1!\cdots N_{n_i}!$.
The number $Z_{n_i}$ of all the possible allotments $n_i', n_i'', \ldots, n_i^{(r_i)}$ of
$n_i$ is given by the sum

$$Z_{n_i} = \sum \frac{r_i!}{N_0!\,N_1!\cdots N_{n_i}!}$$

in which all resolutions of $n_i$ into $N_0\cdot 0 + N_1\cdot 1 + \cdots + N_{n_i}\cdot n_i$ are
to be taken into account. If $r_i$ is a very great number, the expression
for the sum can be simplified. In this case the largest fraction
$r_i!/N_0!\,N_1!\cdots N_{n_i}!$ that occurs in the sum exceeds the others to
such an extent that we may neglect them and obtain

$$Z_{n_i} = \frac{r_i!}{N_0!\,N_1!\cdots N_{n_i}!}$$

in which the values $N_0, \ldots, N_{n_i}$ (with $N_0\cdot 0 + N_1\cdot 1 + N_2\cdot 2 + \cdots + N_{n_i}\cdot n_i = n_i$)
are to be taken for which the fraction becomes a maximum. If
we evaluate $Z_{n_i}$ in this way for every $n_i$, the product of all the $Z_{n_i}$ is

$$Z = \prod_i \frac{r_i!}{N_0^{(i)}!\,N_1^{(i)}!\cdots N_{n_i}^{(i)}!}$$

and it gives the number of entirely different ways of realizing the
occupations $n_1, n_2, \ldots$ of the energy levels. In other words, $Z$ is the
probability of the energy distribution $n_1, n_2, \ldots$.
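
The counting of allotments can be checked by brute force for a small level. The following is a minimal sketch under assumed values ($n_i = 3$ particles, $r_i = 4$ eigenstates); the helper names are hypothetical and chosen for illustration only.

```python
from collections import Counter
from itertools import product
from math import comb, factorial


def allotments(n_i, r_i):
    """All ways of writing n_i as an ordered sum n' + n'' + ... over r_i states."""
    return [occ for occ in product(range(n_i + 1), repeat=r_i) if sum(occ) == n_i]


def kind(occ, n_i):
    """The resolution type (N_0, N_1, ..., N_{n_i}) of one allotment."""
    return tuple(occ.count(k) for k in range(n_i + 1))


def ways_for_kind(kind_tuple, r_i):
    """r_i! / (N_0! N_1! ... N_{n_i}!), the count claimed in the text."""
    ways = factorial(r_i)
    for big_n in kind_tuple:
        ways //= factorial(big_n)
    return ways


n_i, r_i = 3, 4  # assumed sample values
by_kind = Counter(kind(occ, n_i) for occ in allotments(n_i, r_i))
Z = sum(by_kind.values())

for k, count in sorted(by_kind.items()):
    print(k, count, ways_for_kind(k, r_i))
print(Z, comb(n_i + r_i - 1, n_i))
```

For each kind the enumerated count agrees with $r_i!/N_0!\,N_1!\cdots$, and the total $Z_{n_i}$ equals the familiar closed form $\binom{n_i + r_i - 1}{n_i}$, which is not used in the text but serves here as an independent check.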

Thus, for a given number $n$ of particles and a given total energy $E$,
we shall have to proceed in the following way: We have to determine
those numbers $N_0^{(i)}, N_1^{(i)}, \ldots, N_{n_i}^{(i)}$ ($i = 1, 2, \ldots$) by which, under
the conditions

$$n = \sum_i n_i = \sum_i\left(N_1^{(i)} + 2N_2^{(i)} + \cdots + n_iN_{n_i}^{(i)}\right)$$

and

$$E = \sum_i n_iE_i = \sum_i E_i\left(N_1^{(i)} + 2N_2^{(i)} + \cdots + n_iN_{n_i}^{(i)}\right)$$

the probability

$$Z = \prod_i\frac{r_i!}{N_0^{(i)}!\,N_1^{(i)}!\cdots N_{n_i}^{(i)}!}$$

is made a maximum. From the $N_k^{(i)}$ numbers we then find the
occupation numbers of the energy levels $E_i$ to be

$$n_i = N_1^{(i)} + 2N_2^{(i)} + \cdots + n_iN_{n_i}^{(i)}$$


The foregoing considerations are easily adapted to the case of a gas
that exists in antisymmetric states only. Then the states of equal
probability are those which are described by the antisymmetric
eigenvectors $X = \sigma''$. In any of the $\sigma''$ functions, $n$ different $\chi_i$ are
realized, so that (226) may be applied again, the difference being, however,
that for $n_i', n_i'', \ldots$ only the values 0 and 1 are admitted, since a
multiple occupation of the $\chi_i$ states is excluded. On the other hand,
the $n_i$ terms may assume any values not greater than $r_i$. Owing to the
limitation of $n_i', n_i'', \ldots$ to values of 0 and 1, it follows that only
$N_0^{(i)}$ and $N_1^{(i)}$ can be different from zero, so that the probability $Z$ of
the occupation $n_1, n_2, \ldots$ may now be written

$$Z = \prod_i\frac{r_i!}{N_0^{(i)}!\,N_1^{(i)}!} \tag{227}$$

In order to find the most probable energy distribution of the gas we
must make (227) a maximum under the conditions

$$n = \sum_i n_i = \sum_i N_1^{(i)}$$

$$E = \sum_i n_iE_i = \sum_i N_1^{(i)}E_i$$
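
For the antisymmetric case each factor of (227) is a binomial coefficient, since $N_1^{(i)} = n_i$ and $N_0^{(i)} = r_i - n_i$. The following is a minimal sketch that evaluates $Z$ and finds the most probable occupation by direct enumeration; the level energies, degeneracies, and totals are assumed toy values, and the function names are hypothetical.

```python
from itertools import product
from math import comb, prod


def fermi_allotments(n_i, r_i):
    """Count the 0/1 occupations of r_i eigenstates holding n_i particles."""
    return sum(1 for occ in product((0, 1), repeat=r_i) if sum(occ) == n_i)


def fermi_weight(occupations, degeneracies):
    """Z of (227): product of r_i! / ((r_i - n_i)! n_i!) over the levels."""
    return prod(comb(r, n) for n, r in zip(occupations, degeneracies))


def most_probable(n, energy, energies, degeneracies):
    """Occupation (n_1, n_2, ...) maximizing Z for fixed n and total energy."""
    ranges = [range(r + 1) for r in degeneracies]
    candidates = [occ for occ in product(*ranges)
                  if sum(occ) == n
                  and sum(k * e for k, e in zip(occ, energies)) == energy]
    return max(candidates, key=lambda occ: fermi_weight(occ, degeneracies))


energies = (0, 1, 2)        # assumed level energies in arbitrary units
degeneracies = (2, 4, 6)    # assumed r_i
best = most_probable(6, 6, energies, degeneracies)
print(best, fermi_weight(best, degeneracies))
```

For realistic particle numbers the enumeration is replaced by the analytic maximization with Stirling's formula, but the small case shows directly which occupation the condition "maximum of $Z$" selects.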


PROBLEMS 


1. Show that, owing to the exclusion principle, the states ns, np, nd, and nf of an
atom (where n is the principal quantum number denoted in Chapter 2 by $n + l + 1$,
and the letters s, p, d, and f mean $l = 0, 1, 2, 3$) can be occupied by no more
than 2, 6, 10, and 14 electrons.

2. Use this result for the explanation of the periodic system.

3. Show that the Bose statistics, when applied to a cavity radiation, leads to
Planck’s radiation law. Begin by dividing the momentum space of the photon
into cells $\Delta\xi\,\Delta\eta\,\Delta\zeta$, a procedure suggested by the uncertainty relations. Apply the
method of Section 44, and find the maximum of $Z$ under the conditions $E$ = constant
and $\sum_k N_k = r_i$. (It would be better to use $\log_e Z$ instead of $Z$.) Finally,
use Boltzmann’s relation for the entropy, $S = k\log_e Z$. The result of the
calculation is
8rv®h 1 
1 Bn yo 3 V par ee Y 


4. In the same way show that, for a gas that satisfies the principle of
antisymmetry, the number of particles with an energy between $\epsilon$ and $\epsilon + d\epsilon$ is given by


4 2 ame de 
As eet ty? 


6 
RELATIVISTIC WAVE EQUATIONS 


45. Particles with Spin ½. Dirac’s Equation. The wave
equation (27), introduced by Schroedinger, does not satisfy the
requirement for relativistic invariance, as is evident from its asymmetry in
the coordinates $xyz$ and the time $t$. This asymmetry originates in the
application of the non-relativistic expression $p^2/2m = -(\hbar^2/2m)\nabla^2$
for the kinetic energy of a particle. In order to obtain an invariant
equation we must start out with the relativistic equation

$$H = c\sqrt{m^2c^2 + p^2} \tag{228}$$


for the energy of a particle on which no force acts, and solve the prob- 
lem of how to interpret the relation in terms of wave mechanics. 
If the wave function $\psi(xyzt)$ is used again to represent the state of the
particle in the sense of the transformation theory, $\psi^*\psi$ must have the
significance of a probability density; because $\psi(xyz)$, being the
projection of the vector $\chi$ on the principal axis $xyz$, should, by the
square of its magnitude, determine the probability of the particle
being found at the point $xyz$. Now, in relativistic theory, the density
is not a scalar quantity but a part of a four-vector which satisfies a
continuity equation. Thus the problem would be to find a function
(i) that corresponds to an invariant differential equation, and (ii)
from which a four-vector can be formed which will satisfy a continuity
equation and which has a temporal component of the form $\psi^*\psi$. As
Dirac has shown, these requirements can be complied with when the
problem deals with particles with spin $\frac{1}{2}\hbar$. But the attempt to
establish a relativistic wave equation for particles with spin 0 or $\hbar$
succeeds only if we drop the condition of a probability density which is
always positive. Under no circumstance is it then possible to interpret
$\psi^*\psi$ as a density. This means that the function $\psi$ may no longer be
considered a vector, and accordingly the wave equation has to be
understood in a sense far different from the non-relativistic Schroedinger
equation. It refers no longer to a particle but rather describes a
wave motion having no connection with the idea of a particle,
therefore requiring a new interpretation. It will turn out that the
equation does allow an interpretation in which the idea of particles can be
used, but only after it has been quantized. This can be accomplished
with the aid of the method already adopted in Section 43.

A systematic investigation† shows that there exists a relativistic
wave equation for particles with any given spin. We shall, however,
confine ourselves to the spins 0, ½, and 1, which are of special
importance in the quantum mechanics of wave fields. Historically the first
relativistic equation was that introduced by Dirac. We prefer to
begin with that equation since it is still closely connected with the
plan of the non-relativistic theory.

As before, we consider the wave function $\psi(xyzt)$ a description of a
vector $\chi$ which changes with time according to $(\hbar/i)\,d\chi/dt = H\chi$,
where $H$ is the energy operator of the particle considered. When we
resolve $\chi$ into its components $\psi(xyz)$, relative to the principal system $K$
of the coordinates $xyz$, we obtain the wave equation

$$\frac{\hbar}{i}\frac{\partial\psi}{\partial t} = H\psi \tag{229}$$


First let us consider a particle subject to no force and attempt to make
(229) invariant by taking the relativistic expression (228) for $H$; to
change the expression into an operator we again substitute $-(\hbar/i)(\partial/\partial x)$
for $p_x$ and therefore $-\hbar^2\nabla^2$ for $p^2$. Since a radical then appears
on the right-hand side of the expression, Dirac decided to remove this
inconvenient term by setting $\sqrt{m^2c^2 + p^2} = \alpha_1p_x + \alpha_2p_y + \alpha_3p_z + \beta mc$.
The quantities $\alpha_1, \alpha_2, \alpha_3, \beta$ then must be determined in such a
manner that when the right-hand side is squared we obtain $m^2c^2 + p^2$.
We cannot achieve this by ordinary numbers, but we succeed if we
consider $\alpha_1, \alpha_2, \alpha_3, \beta$ to be matrices, the discussion of which we shall defer
for the present, assuming only that they are commutative with the
momentum components. On the other hand, they are not supposed
to be commutative with one another, and hence when squaring we must
take the factors in the proper order. Then we get

$$(\alpha_1p_x + \alpha_2p_y + \cdots)^2 = \alpha_1^2p_x^2 + \alpha_2^2p_y^2 + \alpha_3^2p_z^2 + \beta^2m^2c^2 + (\alpha_1\alpha_2 + \alpha_2\alpha_1)p_xp_y + \cdots + (\alpha_3\beta + \beta\alpha_3)p_zmc + \cdots$$


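Now we see that the right-hand side becomes $m^2c^2 + p^2$ if we choose
$\alpha_1, \alpha_2, \alpha_3, \beta$ (for $\beta$ the notation $\alpha_4$ may be used for the moment) in
such a way that the following relations hold:

$$\alpha_i\alpha_k + \alpha_k\alpha_i = 2\delta_{ik} \qquad \left(\delta_{ik} = \begin{cases} 1 & \text{for } i = k \\ 0 & \text{for } i \neq k \end{cases}\right) \tag{230}$$

for then we shall have

$$\alpha_1^2 = \alpha_2^2 = \alpha_3^2 = \beta^2 = 1$$

and

$$\alpha_1\alpha_2 + \alpha_2\alpha_1 = 0, \qquad \alpha_1\beta + \beta\alpha_1 = 0, \quad \text{etc.}$$

† P. A. M. Dirac, Proc. Roy. Soc. London, A155, 447 (1936).
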
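The only question then is whether we can really find four quantities
that satisfy (230). That we can is proved easily. First we define
four matrices with two rows by

$$\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_2 = \begin{pmatrix} 0 & i \\ -i & 0 \end{pmatrix}, \quad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \quad 1 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \tag{231}$$

From these we form two four-row matrices,

$$\alpha_i = \begin{pmatrix} 0 & \sigma_i \\ \sigma_i & 0 \end{pmatrix}, \qquad \beta = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \tag{232}$$

Then, using rule (120), we can verify that equations (230) are fulfilled.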
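Now we attempt to base the theory on a Hamiltonian operator of the
form

$$H = c\sum_{i=1}^{3}\alpha_ip_i + mc^2\beta = -\frac{\hbar}{i}\,c\sum_{i=1}^{3}\alpha_i\frac{\partial}{\partial x_i} + mc^2\beta \tag{233}$$

(we shall use the notation $x_1x_2x_3$ for $xyz$). We then obtain the wave
equation

$$\frac{\hbar}{i}\frac{\partial\psi}{\partial t} = -\frac{\hbar}{i}\,c\sum_{i=1}^{3}\alpha_i\frac{\partial\psi}{\partial x_i} + mc^2\beta\psi \tag{234}$$

which is of the first order in both $t$ and the coordinates $x_1x_2x_3$.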
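
That four-row matrices of the block form described above satisfy the relations (230) can be confirmed by direct multiplication. The following is a minimal sketch in plain Python; the explicit two-row matrices, and in particular the sign chosen for `s2`, are assumptions made for illustration, and the relations (230) hold for either sign.

```python
# Assumed two-row building blocks (Pauli-type matrices). The sign convention
# of s2 is an assumption; relations (230) do not depend on it.
s1 = [[0, 1], [1, 0]]
s2 = [[0, 1j], [-1j, 0]]
s3 = [[1, 0], [0, -1]]
unit2 = [[1, 0], [0, 1]]


def kron(a, b):
    """Kronecker product: the block matrix whose (i, j) block is a[i][j] * b."""
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def anticommutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x + y for x, y in zip(row_ab, row_ba)] for row_ab, row_ba in zip(ab, ba)]


# alpha_i = [[0, s_i], [s_i, 0]] and beta = [[1, 0], [0, -1]] in two-row blocks.
alphas = [kron(s1, s) for s in (s1, s2, s3)]
beta = kron(s3, unit2)
matrices = alphas + [beta]  # beta plays the role of alpha_4


def expected(i, k):
    """2 * delta_ik times the four-row unit matrix."""
    return [[2 if (i == k and r == c) else 0 for c in range(4)] for r in range(4)]


relations_hold = all(anticommutator(matrices[i], matrices[k]) == expected(i, k)
                     for i in range(4) for k in range(4))
print(relations_hold)
```

All entries are 0, ±1, or ±i, so the arithmetic is exact and the comparison needs no tolerance.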

The next question is how the four-row matrices (232) which occur
in the equation are to be interpreted. Evidently their significance
can only be that, besides the coordinates $x_1x_2x_3$, a fourth observable $\rho$,
called the spin coordinate, is to be associated with the particle; in a
way that is still to be explained, it describes the internal state of the
particle and can assume four values only. In order to represent this
fourth coordinate in the geometric formalism of the theory, we have
to amplify the coordinate system $K$ of the Hilbert space. Up to this
point any axis of $K$ belonged to certain values of $xyz$. At the present
stage, however, the values of $x$, $y$, and $z$ do not define the state of the
particle uniquely but require a specification of the internal state. This
forces us to change any $xyz$ axis of $K$ into a four-dimensional sub-space,
in which, by four orthogonal axes, a coordinate system $K_s$ is scaffolded.
It is this sub-space to which the matrices $\alpha_i$ and $\beta$ refer. (They depend,
of course, on the choice of $K_s$ and undergo a transformation when we
exchange $K_s$ for another system $K_s'$.) The four axes of $K_s$ define
four unit vectors $\chi_1, \chi_2, \chi_3, \chi_4$ corresponding to four states all of which
belong to the same values of $xyz$ but differ, as we shall see, from one
another by the spin and the sign of the energy. This means that,
because of the spin coordinate $\rho$, instead of one wave function $\psi(xyzt)$,
four functions must now be introduced which may be distinguished
from one another by an index $\rho$; thus a vector $\chi$ representing a certain
state of the particle needs four functions $\psi_1, \psi_2, \psi_3, \psi_4$ for its
description. The meaning of the $\psi_\rho$ functions is this: $|\psi_\rho|^2$ determines the
probability of finding the particle, by a simultaneous measurement of
position and spin, at the point $xyz$ in an internal state corresponding to
the number $\rho$. By means of an operator $A$, $\chi(\psi_1\psi_2\psi_3\psi_4)$
transforms into another vector $\chi'(\psi_1'\psi_2'\psi_3'\psi_4')$, the linear connection of $\chi$
and $\chi' = A\chi$ being effected by equations of the kind

$$\psi_\rho' = \sum_{\sigma=1}^{4}A_{\rho\sigma}\psi_\sigma \tag{235}$$


The $A_{\rho\sigma}$ form a matrix and comprise certain differential operators
which can act on the $\psi_\sigma$ functions. An example of such is the operator
$H$ defined by (233), the elements of which are

$$H_{\rho\sigma} = -\frac{\hbar}{i}\,c\sum_i\alpha^i_{\rho\sigma}\frac{\partial}{\partial x_i} + mc^2\beta_{\rho\sigma}$$


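For $\partial\psi_\rho/\partial t$ from (234), we obtain

$$\frac{\partial\psi_\rho}{\partial t} = \sum_\sigma\left(-c\sum_i\alpha^i_{\rho\sigma}\frac{\partial\psi_\sigma}{\partial x_i} + \frac{i}{\hbar}mc^2\beta_{\rho\sigma}\psi_\sigma\right)$$

Multiplying by $\psi_\rho^*$ and summing over all $\rho$ gives

$$\sum_\rho\psi_\rho^*\frac{\partial\psi_\rho}{\partial t} = \sum_{\rho\sigma}\left(-c\sum_i\psi_\rho^*\alpha^i_{\rho\sigma}\frac{\partial\psi_\sigma}{\partial x_i} + \frac{i}{\hbar}mc^2\psi_\rho^*\beta_{\rho\sigma}\psi_\sigma\right)$$

Passing to the complex conjugate equation, we have, since
$\alpha^{i*}_{\rho\sigma} = \alpha^i_{\sigma\rho}$ and $\beta^*_{\rho\sigma} = \beta_{\sigma\rho}$,

$$\sum_\rho\psi_\rho\frac{\partial\psi_\rho^*}{\partial t} = \sum_{\rho\sigma}\left(-c\sum_i\psi_\rho\,\alpha^i_{\sigma\rho}\frac{\partial\psi_\sigma^*}{\partial x_i} - \frac{i}{\hbar}mc^2\psi_\rho\beta_{\sigma\rho}\psi_\sigma^*\right)$$

When the indices $\rho$ and $\sigma$ are interchanged in this equation (which only
means another notation of the indices), the summation of the two
equations gives

$$\frac{\partial}{\partial t}\sum_\rho\psi_\rho^*\psi_\rho = -c\sum_i\frac{\partial}{\partial x_i}\sum_{\rho\sigma}\psi_\rho^*\alpha^i_{\rho\sigma}\psi_\sigma \tag{236}$$

a relation that has the form of the continuity equation

$$\frac{\partial\rho}{\partial t} = -\operatorname{div}\mathbf i \tag{237}$$

so that

$$\rho = \sum_\sigma\psi_\sigma^*\psi_\sigma$$

may be interpreted as the probability density and

$$\mathbf i = c\sum_{\rho\sigma}\psi_\rho^*\vec\alpha_{\rho\sigma}\psi_\sigma = c(\psi, \vec\alpha\psi) \tag{238}$$

as a probability current. The arrow placed over $\alpha$ means that the
three quantities, $\alpha_1, \alpha_2, \alpha_3$, are taken together as a matrix vector; thus
$(\psi, \vec\alpha\psi)$ is to be understood as a vector with the components $(\psi, \alpha_i\psi)$.
The product is to be read in the sense of (113).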

We have already remarked that $\alpha_i$ and $\beta$ must, somehow, define
the internal state of the particle. In fact, as we shall see, the
phenomenon of the spin can be derived from them. Unexpectedly,
however, they simultaneously determine the velocity of the particle.
According to (163), the equation $dA/dt = (i/\hbar)(AH - HA)$ holds for
any observable of a system, where $A$ is the matrix associated with the
observable. When we apply this relation to one of the $x_i$ coordinates,
taking into account that $x_i$ and $\alpha_i$ are commutative, from (233) we
obtain

$$\frac{dx_i}{dt} = -c\alpha_i\left(x_i\frac{\partial}{\partial x_i} - \frac{\partial}{\partial x_i}x_i\right) = c\alpha_i \tag{239}$$

Thus we have not, as in non-relativistic theory, $dx_i/dt = p_i/m$;
rather $dx_i/dt$ is a matrix and its measurement should, therefore,
always furnish one of the eigenvalues of the matrix. These
eigenvalues, evaluated according to Section 25 for all the $\alpha_i$, turn out to be
$+1$ and $-1$, so that, according to Dirac’s theory, an electron should
be expected to be moving always with the velocity $c$ of light. The
explanation of this seemingly absurd result is, as Schroedinger was
able to show, that on the progressing motion of the electron (which
alone can be measured) there is superposed an oscillating motion
consisting of very rapid vibrations which the electron performs about its
path of propagation. The value $c\alpha_i$ refers to the total motion and
does not, therefore, measure the velocity of advance of the electron,
which velocity may have any value less than $c$.
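
The statement that every $\alpha_i$ has only the eigenvalues $+1$ and $-1$ follows from $\alpha_i^2 = 1$, and the multiplicity of each eigenvalue is the trace of the corresponding projector $\frac{1}{2}(1 \pm \alpha_i)$. The following is a minimal sketch of that check; the explicit matrices are an assumed representation of the block form used in the text, and the helper names are hypothetical.

```python
# Assumed two-row blocks; the sign convention of s2 does not affect eigenvalues.
s1 = [[0, 1], [1, 0]]
s2 = [[0, 1j], [-1j, 0]]
s3 = [[1, 0], [0, -1]]


def kron(a, b):
    """Kronecker product: the block matrix whose (i, j) block is a[i][j] * b."""
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


alphas = [kron(s1, s) for s in (s1, s2, s3)]  # alpha_i = [[0, s_i], [s_i, 0]]
identity = [[1 if r == c else 0 for c in range(4)] for r in range(4)]


def projector(alpha, sign):
    """(1 + sign * alpha) / 2, the projector on the eigenvalue sign = +1 or -1."""
    return [[(identity[r][c] + sign * alpha[r][c]) / 2 for c in range(4)]
            for r in range(4)]


def trace(m):
    return sum(m[r][r] for r in range(4))


multiplicities = []
for alpha in alphas:
    assert matmul(alpha, alpha) == identity  # alpha^2 = 1: eigenvalues are +1 or -1
    for sign in (+1, -1):
        p = projector(alpha, sign)
        assert matmul(p, p) == p             # p is indeed a projector
        multiplicities.append(round(trace(p).real))  # trace = multiplicity
print(multiplicities)
```

Each eigenvalue occurs twice for every $\alpha_i$, so that a measurement of $c\alpha_i$ can yield only $+c$ or $-c$.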

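Let us now investigate the phenomenon of spin. This will be
explained in the following way: In non-relativistic theory the operator
of the angular momentum $\mathbf r\times\mathbf p = -(\hbar/i)\,\mathbf r\times\operatorname{grad}$ is commutative
with $H$, so that the principal axes of $H$ and $\mathbf r\times\mathbf p$ coincide, the
consequence being that in a stationary state the angular momentum as
well as the energy remains constant. In Dirac’s theory this
commutability of $H$ and $\mathbf r\times\mathbf p$ disappears, for we have

$$\begin{aligned} H(\mathbf r\times\mathbf p)_x - (\mathbf r\times\mathbf p)_xH &= -\hbar^2c\left[\left(\alpha_1\frac{\partial}{\partial x} + \alpha_2\frac{\partial}{\partial y} + \alpha_3\frac{\partial}{\partial z}\right)\left(y\frac{\partial}{\partial z} - z\frac{\partial}{\partial y}\right) - \left(y\frac{\partial}{\partial z} - z\frac{\partial}{\partial y}\right)\left(\alpha_1\frac{\partial}{\partial x} + \alpha_2\frac{\partial}{\partial y} + \alpha_3\frac{\partial}{\partial z}\right)\right] \\ &= -\hbar^2c\left(\alpha_2\frac{\partial}{\partial z} - \alpha_3\frac{\partial}{\partial y}\right) \end{aligned} \tag{240}$$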


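This means that in a stationary state $\mathbf r\times\mathbf p$ does not remain constant.
The reason for this is that now $\mathbf r\times\mathbf p$ alone does not represent the total
angular momentum but has to be taken together with a second term
which originates in the internal motion of the particle. In order to
determine this supplementary angular momentum we consider the
matrices

$$\sigma_1 = i\alpha_2\alpha_3, \qquad \sigma_2 = i\alpha_3\alpha_1, \qquad \sigma_3 = i\alpha_1\alpha_2 \tag{241}$$

and investigate the operator $(\hbar/2)\vec\sigma$ ($\vec\sigma$ denotes a vector with the
components $\sigma_i$). $(\hbar/2)\vec\sigma$ is not commutative with $H$ either, for on taking
(230) into account we find

$$\begin{aligned} \frac{\hbar}{2}(H\sigma_1 - \sigma_1H) &= -\frac{\hbar^2}{2}c\left[\left(\alpha_1\frac{\partial}{\partial x} + \alpha_2\frac{\partial}{\partial y} + \alpha_3\frac{\partial}{\partial z}\right)\alpha_2\alpha_3 - \alpha_2\alpha_3\left(\alpha_1\frac{\partial}{\partial x} + \alpha_2\frac{\partial}{\partial y} + \alpha_3\frac{\partial}{\partial z}\right)\right] \\ &= \hbar^2c\left(\alpha_2\frac{\partial}{\partial z} - \alpha_3\frac{\partial}{\partial y}\right) \end{aligned}$$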


Comparing this result with (240), we see that the quantity
$\mathbf r\times\mathbf p + (\hbar/2)\vec\sigma$ is commutative with $H$ and therefore must be taken as the
total angular momentum of the particle. The part $(\hbar/2)\vec\sigma = \mathbf s$ is
due to the spin. For the magnitude of $\mathbf s$ we obtain, from (232),

$$s^2 = \frac{\hbar^2}{4}\left(\sigma_1^2 + \sigma_2^2 + \sigma_3^2\right) = \frac{3}{4}\hbar^2$$


Thus a measurement of $s^2$ furnishes the value $\frac{3}{4}\hbar^2$ in any state of the
particle. On the other hand, the components of $\mathbf s$, because
$H\sigma_i - \sigma_iH \neq 0$, have no fixed values in any stationary state; this holds only
for $\mathbf r\times\mathbf p + \mathbf s$. However, for a sufficiently slow motion, we can, by
neglecting terms of the order of magnitude $v^2/c^2$, simplify the
expression for the energy to $mc^2 + p^2/2m$. When we substitute this
expression for $H$ in $H\sigma_i - \sigma_iH$ ($H$ is the operator of non-relativistic theory),
the difference becomes zero. $\sigma_i$ then commutes with $H$. However, this
does not mean that all the components of $\mathbf s$ then take on fixed values
in a stationary state. For the $\sigma_i$ are not commutative with one another
and therefore cannot be measured simultaneously. Always only one
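component of $\mathbf s$ can be measured, for example, $s_z = (\hbar/2)\sigma_3$. From
(232), for $\sigma_3 = i\alpha_1\alpha_2$, we obtain the diagonal matrix

$$\sigma_3 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}$$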
This means that, in the four states the vectors of which have the
directions of the principal axes of the spin space, the observable $s_z$ is
twice equal to $+\hbar/2$ and twice equal to $-\hbar/2$. Thus Dirac’s theory
furnishes twice as many states as are required by the spin, from which
it may be inferred that, besides the spin, there still must be another
observable with two possible values and consequently the two axes
belonging to the same spin differ with respect to the values of this
observable. It turns out that this observable is to be found from the
sign of the energy. To prove this we assume that the representation of
the $\alpha_i$ terms and $\beta$ is chosen in such a way that all $\alpha_i$ are real, whereas
$\beta$ is imaginary. In order to attain such a representation we must
exchange the coordinate system of the spin space for another one. A
matrix $A$ then becomes $A' = SAS^{-1}$. The desired effect can be
achieved, for example, by means of the transformation matrix
$S = (i/\sqrt{2})\,\alpha_1\alpha_3(\alpha_2 + \beta)$.
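
The algebraic statements of this paragraph lend themselves to a direct numerical check. The following is a minimal sketch, assuming a block representation $\alpha_i = \begin{pmatrix} 0 & s_i \\ s_i & 0\end{pmatrix}$ with two-row matrices `s1`, `s2`, `s3` whose sign convention is an assumption (the order of the signs on the diagonal of $\sigma_3$ depends on it, the other results do not). It forms the spin matrices $\sigma_1 = i\alpha_2\alpha_3$, $\sigma_2 = i\alpha_3\alpha_1$, $\sigma_3 = i\alpha_1\alpha_2$ and applies the transformation $S = (i/\sqrt{2})\,\alpha_1\alpha_3(\alpha_2 + \beta)$.

```python
from math import sqrt

s1 = [[0, 1], [1, 0]]
s2 = [[0, 1j], [-1j, 0]]   # assumed sign convention
s3 = [[1, 0], [0, -1]]
unit2 = [[1, 0], [0, 1]]


def kron(a, b):
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def scale(c, a):
    return [[c * x for x in row] for row in a]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(len(a))] for i in range(len(a[0]))]


def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


alpha1, alpha2, alpha3 = (kron(s1, s) for s in (s1, s2, s3))
beta = kron(s3, unit2)
identity = [[1 if r == c else 0 for c in range(4)] for r in range(4)]

# Spin matrices formed as i * alpha_j * alpha_k.
sigma = [scale(1j, matmul(alpha2, alpha3)),
         scale(1j, matmul(alpha3, alpha1)),
         scale(1j, matmul(alpha1, alpha2))]
sigma_squares_are_unit = all(close(matmul(s, s), identity) for s in sigma)
sigma3_diagonal = [round(complex(sigma[2][r][r]).real) for r in range(4)]
sigmas_commute = close(matmul(sigma[0], sigma[1]), matmul(sigma[1], sigma[0]))

# Transformation S; it is unitary, so its inverse is its Hermitian conjugate.
S = scale(1j / sqrt(2), matmul(matmul(alpha1, alpha3), add(alpha2, beta)))
S_inv = dagger(S)
transformed = [matmul(matmul(S, m), S_inv) for m in (alpha1, alpha2, alpha3, beta)]

alphas_real = all(abs(complex(x).imag) < 1e-12 for m in transformed[:3] for row in m for x in row)
beta_imaginary = all(abs(complex(x).real) < 1e-12 for row in transformed[3] for x in row)
print(sigma_squares_are_unit, sigma3_diagonal, sigmas_commute, alphas_real, beta_imaginary)
```

Since each $\sigma_i^2$ is the unit matrix, $s^2 = (\hbar^2/4)\cdot 3$ in every state, and since the $\sigma_i$ do not commute, only one component can have a fixed value at a time. In this representation $S$ leaves $\alpha_1$ and $\alpha_3$ unchanged and interchanges $\alpha_2$ with $\beta$, which is how the transformed $\alpha_i$ become real and the transformed $\beta$ imaginary.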


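Now let us assume that $\psi_\sigma$ ($\sigma = 1, 2, 3, 4$) represents a stationary
state of the positive energy $E$. Then we have $H\psi = E\psi$ and thus,
because of (233),

$$E\psi_\rho = \sum_\sigma\left(-\frac{\hbar}{i}\,c\sum_i\alpha^i_{\rho\sigma}\frac{\partial\psi_\sigma}{\partial x_i} + mc^2\beta_{\rho\sigma}\psi_\sigma\right)$$

Passing to the complex conjugate equation we obtain, since now
$\alpha^{i*}_{\rho\sigma} = \alpha^i_{\rho\sigma}$, $\beta^*_{\rho\sigma} = -\beta_{\rho\sigma}$,

$$-E\psi_\rho^* = \sum_\sigma\left(-\frac{\hbar}{i}\,c\sum_i\alpha^i_{\rho\sigma}\frac{\partial\psi_\sigma^*}{\partial x_i} + mc^2\beta_{\rho\sigma}\psi_\sigma^*\right)$$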

This equation, however, implies that $\psi_\rho^*$ ($\rho = 1, 2, 3, 4$) represents a
state of negative energy —E. Thus any state of positive energy is 
associated with the existence of a state the energy of which is negative. 
The states of negative energy arise from the fact that in the relativistic 
expression (228) for energy the root may be taken with the negative 
as well as the positive sign; therefore they exist in classical theory. 
But they do not play any part in that theory because the continuous 
manner in which the states change there with time makes their realiza- 
tion impossible, and so they may be disregarded as being meaningless. 
In quantum mechanics, however, these states are not meaningless, for, 
according to the dynamical law (161), the passage from a state of 
positive energy into one of negative energy involves no difficulty what- 
ever. This can be seen immediately from equation (184’) with the 
help of which the probability of the passage induced by a perturbation 
acting on the system can be determined. The probability is proportional
to the square of $|H_{kn}|$, where $k$ denotes a given initial state of
positive energy and n a final state of negative energy. Therefore we 
cannot simply ignore the states of negative energy without destroying 
the theory, for by cancelling the states we reduce the four components 
of $\psi$ to two and thus deprive the theory of its invariance. For a long
time the theory was at a loss to explain the states, for no way could be 
found by which to give them a reasonable interpretation. In the 
next section the consequent explanation of them on the basis of Dirac’s 
hole theory will be discussed. 

At this point we shall not go into a rigorous proof of the relativistic 
invariance of Dirac’s equation. The simplest way would be to interpret
the four quantities $\psi_1, \psi_2, \psi_3, \psi_4$ as two pairs of spinors and apply
the methods of spinor calculus, spinor calculus being merely an exten- 
sion of tensor calculus. At once it would be seen that the four quanti- 
ties, defined by (237) and (238), form a four-vector; this is necessary 
if (236) is to be Lorentz-invariant. 


46. A Particle in an Electromagnetic Field. Dirac’s Hole
Theory. Thus far we have investigated only a particle subject to
no force. Now let us consider a particle on which an electromagnetic
field $\mathbf E$, $\mathbf H$ is acting, the field deriving from a scalar potential $\phi$ and a
vector potential $\mathbf A$ by the equations

$$\mathbf E = -\operatorname{grad}\phi - \frac{1}{c}\frac{\partial\mathbf A}{\partial t}, \qquad \mathbf H = \operatorname{curl}\mathbf A$$

The classical equation (228) for the Hamiltonian $H$ then must be
amplified by terms that contain $\phi$ and $\mathbf A$. It can be shown that $H$
then becomes

$$H = e\phi + c\sqrt{m^2c^2 + \left(\mathbf p - \frac{e}{c}\mathbf A\right)^2} \tag{242}$$


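To prove this we write down the canonical equations

$$\begin{aligned} \frac{dx}{dt} = v_x = \frac{\partial H}{\partial p_x} &= \frac{c\left(p_x - \frac{e}{c}A_x\right)}{\sqrt{m^2c^2 + \left(\mathbf p - \frac{e}{c}\mathbf A\right)^2}} \\ \frac{dp_x}{dt} = -\frac{\partial H}{\partial x} &= -e\frac{\partial\phi}{\partial x} + \frac{e\sum_k\left(p_k - \frac{e}{c}A_k\right)\frac{\partial A_k}{\partial x}}{\sqrt{m^2c^2 + \left(\mathbf p - \frac{e}{c}\mathbf A\right)^2}} \\ &= -e\frac{\partial\phi}{\partial x} + \frac{e}{c}\left(v_x\frac{\partial A_x}{\partial x} + v_y\frac{\partial A_y}{\partial x} + v_z\frac{\partial A_z}{\partial x}\right) \end{aligned} \tag{243}$$

If $d\mathbf A/dt$ signifies the change of the value of $\mathbf A$ at the position of the moving particle,
we have

$$\frac{dA_x}{dt} = \frac{\partial A_x}{\partial t} + v_x\frac{\partial A_x}{\partial x} + v_y\frac{\partial A_x}{\partial y} + v_z\frac{\partial A_x}{\partial z}$$

where $\partial A_x/\partial t$ is the time rate of change of $A_x$ at a fixed point.
Multiplying this equation by $-e/c$ and adding it to the $p_x$ equation, we
obtain

$$\begin{aligned} \frac{d}{dt}\left(p_x - \frac{e}{c}A_x\right) &= -e\frac{\partial\phi}{\partial x} - \frac{e}{c}\frac{\partial A_x}{\partial t} + \frac{e}{c}\left[v_y\left(\frac{\partial A_y}{\partial x} - \frac{\partial A_x}{\partial y}\right) - v_z\left(\frac{\partial A_x}{\partial z} - \frac{\partial A_z}{\partial x}\right)\right] \\ &= e\left(E_x + \frac{1}{c}[\mathbf v\times\mathbf H]_x\right) \end{aligned} \tag{244}$$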


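On the other hand, from (243) and the two corresponding $y$ and $z$
equations, squaring and adding, we get

$$\mathbf p - \frac{e}{c}\mathbf A = \frac{m\mathbf v}{\sqrt{1 - v^2/c^2}}$$

Thus (244) becomes

$$\frac{d}{dt}\frac{m\mathbf v}{\sqrt{1 - v^2/c^2}} = e\left(\mathbf E + \frac{1}{c}[\mathbf v\times\mathbf H]\right)$$

This result shows that (242) is correct. Thus for an electromagnetic
field $\phi$, $\mathbf A$ the operator $-(\hbar/i)(\partial/\partial x_i)$ of the wave equation must be
replaced by $-(\hbar/i)(\partial/\partial x_i) - (e/c)A_i$, and the operator $(\hbar/i)(\partial/\partial t)$
by $(\hbar/i)(\partial/\partial t) - e\phi$. Then we obtain

$$\left[\frac{\hbar}{i}\frac{\partial}{\partial t} - e\phi + c\sum_{i=1}^{3}\alpha_i\left(\frac{\hbar}{i}\frac{\partial}{\partial x_i} + \frac{e}{c}A_i\right) - mc^2\beta\right]\psi = 0 \tag{245}$$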
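
The step from the canonical velocity equation to the relativistic momentum can be verified numerically. The following is a minimal sketch with assumed sample values: it takes a kinetic momentum $\boldsymbol\pi = \mathbf p - (e/c)\mathbf A$, computes the velocity from $v_k = c\pi_k/\sqrt{m^2c^2 + \pi^2}$, and recovers $\boldsymbol\pi$ from $m\mathbf v/\sqrt{1 - v^2/c^2}$. The function names and the numbers are illustrative only.

```python
from math import isclose, sqrt


def velocity(pi, m, c):
    """v_k = c * pi_k / sqrt(m^2 c^2 + pi^2), with pi = p - (e/c) A."""
    norm = sqrt(m * m * c * c + sum(x * x for x in pi))
    return [c * x / norm for x in pi]


def kinetic_momentum(v, m, c):
    """m v / sqrt(1 - v^2 / c^2), the relativistic momentum for velocity v."""
    v2 = sum(x * x for x in v)
    return [m * x / sqrt(1.0 - v2 / (c * c)) for x in v]


m, c = 1.0, 1.0            # assumed units with m = c = 1
pi = [0.3, -1.2, 2.0]      # assumed sample value of p - (e/c) A
v = velocity(pi, m, c)
speed = sqrt(sum(x * x for x in v))
recovered = kinetic_momentum(v, m, c)
print(speed, recovered)
```

The speed obtained in this way stays below $c$ for every finite $\boldsymbol\pi$, and the recovered vector agrees with the starting value, as the squared-and-added equations require.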


Although the explanation of spin as a relativity effect is an undoubted
success for Dirac’s theory, it had to be accepted on the assumption that
there are states of negative energy. In these states the electron should
have energy between $-mc^2$ and $-\infty$ and therefore should behave like
a body of negative mass, that is, when a force acts on it it should move 
in a direction opposite to the force. Because an electron of this sort 
was never observed, there arose a clash between theory and experience 
which could be remedied only with the help of a supplementary princi- 
ple excluding the states of negative energy. For this purpose Dirac 
suggested the assumption that, when no field is acting, all the states of 
negative energy are occupied. Then, because of the exclusion principle 
that holds for electrons, no electron is able to assume a state of negative 
energy. It is further assumed that electrons occupying negative 
energy states are unable to produce an external field. What we 
observe are always only the deviations of nature from the kind of 
charge distribution wherein all states of negative energy are occupied 
and all states of positive energy are unoccupied. In other words, we 
always measure the resulting charge if from all the actually existing 
charges those of the previously defined “‘zero state” are subtracted. 
The same is assumed to hold for energy and momentum also. For 
example, if a state of negative energy is unoccupied, the hole has the 
same effect as a particle of positive charge, positive energy, and 
momentum of opposite direction. Since the mass as well as the energy
becomes positive, the hole behaves like a positron, which differs from
an electron in the sign of its charge only. Regarding its production, 
we have to imagine that an electromagnetic field, that of a photon for 
example, is able to have an effect on the electrons of negative energy. 
The effect may be that the electron passes from a state of negative to 
one of positive energy. Then the hole, representing a positron, is 
created by the passage and, in addition, an ordinary electron, that is, 
a pair of electrons, will appear. As the energy of the electron is
increased by the passage from a value less than $-mc^2$ to one greater
than $+mc^2$, then for the production of a pair an energy of at least
$2mc^2$ is required. Conversely, an energy greater than or equal to
$2mc^2$ is liberated when a pair is annihilated by a process whereby an
electron drops back into a state of negative energy, in this way filling
the hole so that both electrons disappear again. Experience, in fact,
shows that the creation of a pair takes place only if an amount of
energy $h\nu$ equal at least to $2mc^2$ is available and that, correspondingly,
the annihilation of a pair is accompanied by the emission of light the
frequency of which is equal to or greater than $2mc^2/h$.
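
To attach numbers to the threshold $2mc^2$, the following minimal sketch evaluates it for the electron. The constants are modern SI values inserted for illustration and do not come from the original text; the variable names are hypothetical.

```python
# Modern SI values (assumed here for illustration; not taken from the text).
ELECTRON_MASS_KG = 9.1093837015e-31
SPEED_OF_LIGHT_M_S = 299792458.0
ELEMENTARY_CHARGE_C = 1.602176634e-19
PLANCK_CONSTANT_J_S = 6.62607015e-34

# Minimum energy for creating a pair: 2 m c^2.
threshold_joule = 2.0 * ELECTRON_MASS_KG * SPEED_OF_LIGHT_M_S ** 2
threshold_mev = threshold_joule / ELEMENTARY_CHARGE_C / 1.0e6

# Corresponding frequency 2 m c^2 / h.
threshold_frequency_hz = threshold_joule / PLANCK_CONSTANT_J_S

print(f"2mc^2 = {threshold_mev:.4f} MeV")
print(f"2mc^2/h = {threshold_frequency_hz:.3e} Hz")
```

The threshold lies at about 1.02 MeV, that is, in the region of hard gamma rays.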

We are confronted with certain difficulties when we attempt to give 
the hole theory a consistent formulation. Without going into these 
difficulties, we shall merely point out that when we adopt the assump- 
tions of that theory we are no longer in a position to consider Dirac’s 
equation as the formulation of a one-body problem. If the conception 
of the hole theory is correct, we are not permitted to confine the con- 
siderations to only one electron, since only the deviations from a zero 
state, as characterized by the occupation of all negative energy states, 
are observable. This means that, in any case, the negative energy 
electrons must be considered. As a consequence, equation (237) 
no longer may be interpreted as a particle density, because, owing to 
the possibility of pair creations, the number of observable particles 
in a sufficiently rapidly varying field does not remain constant and 
hence there can be no defined particle density. Therefore we are
obliged to interpret $\rho = (\psi, \psi)$ as a charge density which has a defined
value since it is not changed by the creation or annihilation of a pair.

Thus, from a rigorous standpoint, we cannot establish a relationship 
between Dirac’s theory and the concept of a particle. Such becomes 
possible only after the theory has been quantized (cf. Chapter 7). 
Dirac’s equation corresponds to a one-body problem only when there
is no field permitting the production of pairs; then $\psi$ may be taken as
the vector representing the state of a single particle and $(\psi, \psi)$ as the
probability of finding the particle at a given point. Then the sea
of electrons has only the effect that the electromagnetic field,
by virtual creations and annihilations of pairs, takes on properties
which contradict those of classical electrodynamics and which can be
described only by non-linear equations, according to which it is
possible, for example, for light to be scattered by light.

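47. Particles with Spin 0. The Equation of Klein and Gordon.
The simplest wave equation corresponding to the relativistic
equation (228) for energy is obtained if, as in Schroedinger’s theory,
a scalar function $\psi$ is taken, and if, in the relation $H^2 = c^4m^2 + c^2p^2$,
we substitute $H = (\hbar/i)(\partial/\partial t)$ and $\mathbf p = -(\hbar/i)\operatorname{grad}$, giving

$$-\hbar^2\frac{\partial^2\psi}{\partial t^2} = m^2c^4\psi - c^2\hbar^2\nabla^2\psi$$

If we let $\Box$ represent $-(1/c^2)(\partial^2/\partial t^2) + \nabla^2$, we obtain

$$\Box\psi = \frac{m^2c^2}{\hbar^2}\psi \tag{246}$$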
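
As a short worked check of (246), a plane wave may be inserted. The sign of the exponent below is an assumption chosen to agree with the operator substitutions $H = (\hbar/i)(\partial/\partial t)$ and $\mathbf p = -(\hbar/i)\operatorname{grad}$ used in this section; $E$ and $\mathbf p$ denote the constant energy and momentum of the wave.

```latex
\psi = \exp\!\left[\frac{i}{\hbar}\,(Et - \mathbf{p}\cdot\mathbf{r})\right]
\quad\Longrightarrow\quad
\Box\psi = \left(\frac{E^2}{c^2\hbar^2} - \frac{p^2}{\hbar^2}\right)\psi
         = \frac{m^2c^2}{\hbar^2}\,\psi
\quad\Longrightarrow\quad
E = \pm\, c\sqrt{m^2c^2 + p^2}
```

The equation thus reproduces the relativistic relation (228), but with both signs of the root, so that solutions of negative frequency appear here as well.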


This second-order equation can be easily resolved into two equations
of the first order. From this point on we shall use the variable
$x_4 = ict$ instead of $t$. The advantage of this is that the metric fundamental
form for the four-dimensional world becomes $x_1^2 + x_2^2 + x_3^2 + x_4^2$,
thus making it unnecessary to distinguish between covariant and
contravariant tensors (cf. Section 48). By taking the four-dimensional
gradient we can then derive from $\psi$ a four-vector with components
$\partial\psi/\partial x_\alpha$ ($\alpha = 1, 2, 3, 4$). By way of definition we put this vector
equal to another four-vector $\chi$ multiplied by $\kappa$, where $\kappa = mc/\hbar$; thus

$$\frac{\partial\psi}{\partial x_\alpha} = \kappa\chi_\alpha \tag{247}$$

The divergence of $\chi$ is a scalar, and (246) may be written in the form

$$\frac{\partial\chi_\alpha}{\partial x_\alpha} = \kappa\psi \tag{247'}$$

(We follow the usual convention in considering an expression in which
two identical indices occur as a sum in which $\alpha$ assumes all values from 1 to
4.) When an electromagnetic field $\phi$, $\mathbf A$ acts on the particle, according
to Section 46 the operators $(\hbar/i)(\partial/\partial t)$ and $-(\hbar/i)(\partial/\partial x_i)$ are to be
changed to $(\hbar/i)(\partial/\partial t) - e\phi$ and $-(\hbar/i)(\partial/\partial x_i) - (e/c)A_i$
respectively. We can simplify this directive by combining $\mathbf A$ and $\phi$ to a
four-vector $\Phi$ with $\Phi_i = A_i$ ($i = 1, 2, 3$) and $\Phi_4 = i\phi$. Then, for any
index $\alpha$, the operator $(\hbar/i)(\partial/\partial x_\alpha)$ is to be changed to
$(\hbar/i)(\partial/\partial x_\alpha) + (e/c)\Phi_\alpha$, and (246) becomes

$$\sum_\alpha\left(\frac{\hbar}{i}\frac{\partial}{\partial x_\alpha} + \frac{e}{c}\Phi_\alpha\right)^2\psi = -m^2c^2\psi \tag{248}$$


The problem now is to find the meaning of the wave function $\psi$. For
this purpose we must try to form a four-vector $s_\alpha$ from $\psi$ which, because
of (248), satisfies the continuity equation $\partial s_\alpha/\partial x_\alpha = 0$, and the time
component of which may therefore be viewed as representing a density.
Such a vector is found in

$$s_\alpha = a\left[\frac{\hbar}{i}\left(\psi^*\frac{\partial\psi}{\partial x_\alpha} - \psi\frac{\partial\psi^*}{\partial x_\alpha}\right) + \frac{2e}{c}\Phi_\alpha\psi^*\psi\right] \tag{249}$$

where $a$ denotes a real factor which remains to be determined. It is
clear at once that the expression corresponds to a four-vector because
the first two terms transform like a gradient and the third one like $\Phi$.
Furthermore it is seen readily that the divergence $\partial s_\alpha/\partial x_\alpha$ vanishes
because of (248). To show this, all we must do is to multiply the
equation

$$\sum_\alpha\left(-\frac{\hbar}{i}\frac{\partial}{\partial x_\alpha^*} + \frac{e}{c}\Phi_\alpha^*\right)^2\psi^* = -m^2c^2\psi^*$$

by $\psi$ and subtract it from (248), which has been multiplied by $\psi^*$.
The equation $\partial s_\alpha/\partial x_\alpha = 0$ results. Conforming to the requirements of
its physical interpretation, the vector (249) is real in its space
components but the fourth component is imaginary. As the latter is to be
considered as $ic\rho$ (this is evident from the equation $\partial s_\alpha/\partial x_\alpha = 0$, which
then becomes $\partial\rho/\partial t = -\operatorname{div}\mathbf s$, $\mathbf s$ being a vector with components
$s_1, s_2, s_3$), $\rho$ is given by

$$\rho = \frac{a}{c^2}\left[-\frac{\hbar}{i}\left(\psi^*\frac{\partial\psi}{\partial t} - \psi\frac{\partial\psi^*}{\partial t}\right) + 2e\phi\,\psi^*\psi\right] \tag{249'}$$


As the wave equation is of the second order, the derivative
$\partial\psi/\partial t$ may be chosen arbitrarily for a given instant so that
(249') may take on both positive and negative values. Thus $\rho$ cannot
signify a particle density, for which negative values are without meaning,
but must be considered a charge density. Hence there is in the scalar wave
field no probability of finding the particle when its position is measured
at a given point, and as a result we can no longer look upon $\psi$ as a
vector representing the state of a particle. This means that (248)
requires an interpretation quite different from that of the non-relativistic
equation of Schroedinger. However, before going into this
matter, let us determine first the expressions for the energy and momentum
of the field which result from (248). We shall confine ourselves
to the case $\phi = \mathbf{A} = 0$, so that $\psi$ satisfies equation (246).
The equation corresponds to a certain Lagrangian function $L$, from which it
can be derived by $\delta\int L\,dt = 0$. If we set $L = \int\mathcal{L}\,dv$
(where $\mathcal{L}$ denotes the Lagrangian per unit volume), $\mathcal{L}$
(aside from an arbitrary factor) is found to be


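$$\mathcal{L} = -\left(\kappa\,\psi^*\psi + \frac{1}{\kappa}\,\frac{\partial\psi^*}{\partial x_\alpha}\frac{\partial\psi}{\partial x_\alpha}\right) = -\kappa\left(\psi^*\psi + \chi_\alpha^*\chi_\alpha\right) \qquad (250)$$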


in which $\mathcal{L}$ is expressed in terms of $\psi$ and $\psi^*$, which,
being functions of $x, y, z, t$, determine the momentary state of the field
and thus take on the significance of generalized coordinates. In order to
derive the field equations, $\psi$ and $\psi^*$ must be chosen in such a way
that $\int L\,dt$ takes on a maximum or minimum value, that is,
$\int\delta L\,dt$ must be equal to zero
for any infinitesimal variation $\delta\psi$ and $\delta\psi^*$ which vanishes
at the boundaries of the field at the given time. This is fulfilled if $\psi$
satisfies (246), for on integrating by parts we get


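$$\delta\int L\,dt = -\iint\left[\kappa\left(\psi\,\delta\psi^* + \psi^*\,\delta\psi\right) + \frac{1}{\kappa}\left(\frac{\partial\psi}{\partial x_\alpha}\frac{\partial\,\delta\psi^*}{\partial x_\alpha} + \frac{\partial\psi^*}{\partial x_\alpha}\frac{\partial\,\delta\psi}{\partial x_\alpha}\right)\right] dv\,dt$$

$$= -\iint\left[\left(\kappa\psi - \frac{1}{\kappa}\frac{\partial^2\psi}{\partial x_\alpha^2}\right)\delta\psi^* + \left(\kappa\psi^* - \frac{1}{\kappa}\frac{\partial^2\psi^*}{\partial x_\alpha^2}\right)\delta\psi\right] dv\,dt$$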


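The momenta, which are canonically conjugate to $\psi$ and $\psi^*$ and which
are designated by $\pi$ and $\pi^*$, are found from (250) to be

$$\pi = \frac{\partial\mathcal{L}}{\partial(\partial\psi/\partial t)} = \frac{1}{\kappa c^2}\frac{\partial\psi^*}{\partial t} \qquad \pi^* = \frac{\partial\mathcal{L}}{\partial(\partial\psi^*/\partial t)} = \frac{1}{\kappa c^2}\frac{\partial\psi}{\partial t}$$

As is known, the energy $H$ of the system is obtained from $L = L(q, \dot q)$
by the relation $H = \sum p\dot q - L$. In the case at hand we find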


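$$H = \int\mathcal{H}\,dv$$

$$\mathcal{H} = \pi\frac{\partial\psi}{\partial t} + \pi^*\frac{\partial\psi^*}{\partial t} - \mathcal{L} = \frac{1}{\kappa c^2}\frac{\partial\psi^*}{\partial t}\frac{\partial\psi}{\partial t} + \frac{1}{\kappa}\sum_{i=1}^{3}\frac{\partial\psi^*}{\partial x_i}\frac{\partial\psi}{\partial x_i} + \kappa\,\psi^*\psi \qquad (251)$$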
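As a numerical illustration of (250) and (251), the following is a minimal sketch that assumes a one-dimensional plane wave $\psi = A\,e^{i(kx - \omega t)}$ with $\omega^2 = c^2(k^2 + \kappa^2)$. The values of `c`, `kappa`, `k` and `A` are arbitrary illustrative choices, not taken from the text. The sketch evaluates the energy density from (251) and compares it with $\pi\,\partial\psi/\partial t + \pi^*\,\partial\psi^*/\partial t - \mathcal{L}$ formed from (250).

```python
import cmath
import math

# Illustrative values only (assumptions, not taken from the text).
c, kappa, k, A = 1.0, 0.7, 1.3, 0.5 + 0.2j
omega = c * math.sqrt(k * k + kappa * kappa)  # dispersion relation of (246)

def fields(x, t, sign=+1):
    """Plane wave psi, d(psi)/dt and d(psi)/dx; sign=-1 reverses the frequency."""
    w = sign * omega
    psi = A * cmath.exp(1j * (k * x - w * t))
    return psi, -1j * w * psi, 1j * k * psi

def energy_density(psi, psi_t, psi_x):
    """Equation (251), restricted to one space dimension."""
    return (abs(psi_t) ** 2 / (kappa * c * c)
            + abs(psi_x) ** 2 / kappa
            + kappa * abs(psi) ** 2)

def legendre_energy(psi, psi_t, psi_x):
    """pi*dpsi/dt + pi_star*dpsi_star/dt minus the Lagrangian density (250)."""
    lagrangian = (abs(psi_t) ** 2 / (kappa * c * c)
                  - abs(psi_x) ** 2 / kappa
                  - kappa * abs(psi) ** 2)
    pi = psi_t.conjugate() / (kappa * c * c)  # momentum conjugate to psi
    return (pi * psi_t + pi.conjugate() * psi_t.conjugate()).real - lagrangian

h_pos = energy_density(*fields(0.3, 0.1, +1))
h_neg = energy_density(*fields(0.3, 0.1, -1))
h_leg = legendre_energy(*fields(0.3, 0.1, +1))
```

Under these assumptions the two expressions agree, and the energy density keeps the same positive value when the sign of the frequency is reversed, in contrast to the density (249'), which can take either sign.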




These expressions represent the energy density of the field. To
determine the momentum density $\mathbf{g}$ we have to develop (251) to a
symmetric tensor $T$ of the second order in which (251) occurs as
the term $T_{44}$. This tensor $T$, by $T_{k4}/ic$, gives the components $g_k$
of the momentum density. Now, from $\psi$ and $\psi^*$, we can form two
symmetric tensors of the second order. One of these can be derived from
the four-vectors grad $\psi$ and grad $\psi^*$ and is given by


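$$T_{mn}' = \frac{\partial\psi^*}{\partial x_m}\frac{\partial\psi}{\partial x_n} + \frac{\partial\psi^*}{\partial x_n}\frac{\partial\psi}{\partial x_m}$$

The other is the product of the scalar $\psi^*\psi$ and the tensor
$\delta_{mn}$, and is expressed by

$$T_{mn}'' = \psi^*\psi\,\delta_{mn}$$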


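The scalar function $\mathcal{L}$ may be used instead of $\psi^*\psi$ for the
definition of $T''$. Then the combination of $T'$ and $T''$ which fulfills the
required condition is

$$T_{mn} = -\frac{1}{\kappa}\left(\frac{\partial\psi^*}{\partial x_m}\frac{\partial\psi}{\partial x_n} + \frac{\partial\psi^*}{\partial x_n}\frac{\partial\psi}{\partial x_m}\right) - \mathcal{L}\,\delta_{mn} \qquad (252)$$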


For the components of $\mathbf{g}$ we obtain, from the above equation,

$$g_k = -\frac{1}{ic}\,T_{k4} = -\frac{1}{\kappa c^2}\left(\frac{\partial\psi^*}{\partial x_k}\frac{\partial\psi}{\partial t} + \frac{\partial\psi^*}{\partial t}\frac{\partial\psi}{\partial x_k}\right) \qquad (253)$$

When a field of force $\phi, \mathbf{A}$ is acting in (252) and (253), we must,
as in the passage from (246) to (248), change $\partial\psi/\partial x_\alpha$
and $\partial\psi^*/\partial x_\alpha$ into
$\partial\psi/\partial x_\alpha + (ie/\hbar c)\Phi_\alpha\psi$ and
$\partial\psi^*/\partial x_\alpha - (ie/\hbar c)\Phi_\alpha\psi^*$ respectively.

A scalar field can be correlated only to those particles which have a
spin zero. Since the spin is to be considered an internal property of
the particles and independent of the xyz coordinates, it can be represented
only by means of a wave function which, in addition to depending
on the coordinates xyz, depends on still another variable p. This
means that $\psi$ must consist of several components just as did the $\psi$ of
Dirac.

It is important to realize the physical significance of the wave equation
(248). It has been emphasized already that this equation no
longer permits an interpretation in the sense of non-relativistic quantum
mechanics. As long as it is not quantized, it has no relation whatever
with particles but controls the behavior of a field that is
not purely symbolic but actually represents a physical reality. This
is clear from the fact that, according to (249), (251), and (253), the field
possesses charge, energy, and momentum and thus it can be measured
relative to these observables. It is only a question then as to whether
wave fields corresponding to the assumed equation really exist, a 
question that must be answered in the affirmative. All our experiences 
dealing with elementary particles lead to the conclusion that the 
particles must be related in some way to a wave field, since in certain 
experiments they conceal their corpuscular nature and behave as 
wave motions. Accordingly it is reasonable to correlate a certain wave 
motion to any sort of particles and to suppose that this motion is 
defined by one of the relativistic equations under discussion in this 
chapter. 

However, this picture of a wave motion represents only one aspect of 
actual experience because there are other experiments in which no 
undulatory properties are detected, the particles displaying a strictly 
corpuscular nature. Such a situation compels the theory to associate 
the wave field with some symbolic mechanism based on the commuta- 
tion relations and operating in such a way that the wave picture and 
the particle picture disappear alternately, the consequence being that 
the field assumes the strikingly ambiguous character which is observed 
in nature. By methods to be developed in the next chapter, we 
shall see that the mechanism functions in such a way that the field is 
quantized. 

48. Digression on Tensor Calculus. Pseudoscalar Wave 
Field. In our further investigations the methods of tensor calculus 
will be of great help. For this reason, we shall outline these methods 
briefly. We fix our attention on a four-dimensional space in which 
an orthogonal coordinate system $K$ may be established. A contravariant
tensor of the first order is defined by a set of four quantities,
$a^1, a^2, a^3, a^4$, which may be either constants or functions of
$x_1, x_2, x_3, x_4$, and which, when $K$ is replaced by another coordinate
system $K'$, transform in the same way as the components of a four-vector.
Thus the components $a'^i$ of the tensor, when referred to $K'$, are linear
functions of $a^k$, that is, $a'^i = \alpha^{ik}a^k$, where $\alpha^{ik}$
represents the coefficients of the transformation $K \to K'$. A tensor
$a_1, a_2, a_3, a_4$ is called covariant if, when $K$ is changed to $K'$, the
$a_i' = \alpha_{ik}a_k$ terms transform in such a way that the sum $a_i a^i$
changes to $a_i' a'^i$. Since $a_i' a'^i = \alpha_{ik}\alpha^{im}a_k a^m$,
the transformation coefficients $\alpha_{ik}$ must satisfy the condition
$\alpha_{ik}\alpha^{im} = \delta_k^m = 1$ or $0$, depending on whether
$m = k$ or $m \neq k$. We obtain
$a_i' a'^i = \delta_k^m a_k a^m = a_m a^m$.

A contravariant tensor of the second order is defined as a set of sixteen
quantities $a^{ik}$, which transform like the products $a^i b^k$, where $a^i$
and $b^k$ are the components of two tensors of the first order. Thus
$a'^{ik} = \alpha^{im}\alpha^{kn}a^{mn}$, which is to be summed up over $m$
and $n$. Similarly a covariant tensor of the second order $a_{ik}$ transforms
like $a_i b_k$: $a_{ik}' = \alpha_{im}\alpha_{kn}a_{mn}$. It can be inferred
from this definition that $a_{ik}a^{ik}$ is invariant, for we have

$$a_{ik}'\,a'^{ik} = \alpha_{im}\alpha_{kn}\alpha^{ip}\alpha^{kq}\,a_{mn}a^{pq} = \delta_m^p\,\delta_n^q\,a_{mn}a^{pq} = a_{mn}a^{mn}$$

A mixed tensor of the second order consists of sixteen quantities $a_i^k$
which transform like the products $a_i b^k$, so that
$a_i'^k = \alpha_{im}\alpha^{kn}a_m^n$. It follows that $a_i^i$ is invariant
because

$$a_i'^i = \alpha_{im}\alpha^{in}a_m^n = \delta_m^n\,a_m^n = a_n^n$$


A contravariant or covariant tensor of the second order is symmetric
if $a^{ik} = a^{ki}$ or $a_{ik} = a_{ki}$ and antisymmetric if
$a^{ik} = -a^{ki}$ or $a_{ik} = -a_{ki}$. These properties are not changed by
a transformation, since

$$a'^{ik} = \alpha^{im}\alpha^{kn}a^{mn} = \alpha^{im}\alpha^{kn}a^{nm} = a'^{ki}$$

In a similar way it can be shown that $a'^{ik} = -a'^{ki}$ when
$a^{ik} = -a^{ki}$.


It is immediately evident how these definitions are to be extended to 
higher orders. The tensor of order 0 is defined as a scalar. A tensor 
of arbitrary order is said to be symmetric or antisymmetric depending 
on whether the exchange of two indices leaves the sign of a component 
the same or reverses it. There are two very useful rules in tensor 
calculus: 


(i) When the components of a tensor of order $i$ are multiplied by
those of a tensor of order $k$, a tensor of the order $i + k$ results. This is
a direct consequence of the transformation properties of tensors.

(ii) When, in a mixed tensor, an upper and lower index are set equal
(then we must sum over the index), we obtain a tensor the order
of which is reduced by 2.


To prove this let us consider the tensor $a_i^{km}$. Then
$a_i'^{im} = \alpha_{in}\alpha^{ip}\alpha^{mq}a_n^{pq} = \delta_n^p\,\alpha^{mq}a_n^{pq} = \alpha^{mq}a_n^{nq}$,
this relation implying that $a_n^{nq}$ transforms like $a^q$. This procedure
of making two indices equal for the purpose of lowering the order of a mixed
tensor is called contraction. As an example, we have already had $a_i^i$ as
invariant, that is, forming a tensor of order 0, which by contraction is
derived from the tensor $a_i^k$.
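The invariance obtained by contraction can be checked numerically. The sketch below assumes a real orthogonal transformation, for which the coefficients $\alpha_{ik}$ and $\alpha^{ik}$ coincide; the rotation angles and the tensor components are arbitrary illustrative values. It transforms a second-order tensor by $a_{ik}' = \alpha_{im}\alpha_{kn}a_{mn}$ and compares the contracted sum before and after.

```python
import math

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][m] * b[m][k] for m in range(n)) for k in range(n)]
            for i in range(n)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def rotation(theta, p, q, n=4):
    """Orthogonal transformation: rotation by theta in the (p, q) plane."""
    r = [[1.0 if i == k else 0.0 for k in range(n)] for i in range(n)]
    r[p][p] = r[q][q] = math.cos(theta)
    r[p][q] = math.sin(theta)
    r[q][p] = -math.sin(theta)
    return r

# Transformation coefficients (illustrative angles, an assumption).
alpha = matmul(rotation(0.4, 0, 1), rotation(1.1, 2, 3))

# An arbitrary second-order tensor with sixteen components.
a = [[float(3 * i - 2 * k + i * k) for k in range(4)] for i in range(4)]

# a'_ik = alpha_im alpha_kn a_mn, written as a matrix product.
a_prime = matmul(matmul(alpha, a), transpose(alpha))

trace = sum(a[i][i] for i in range(4))
trace_prime = sum(a_prime[i][i] for i in range(4))
identity_check = matmul(alpha, transpose(alpha))
```

The individual components of `a_prime` differ from those of `a`, but the contracted sum is the same in both systems.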

Throughout our applications of tensor calculus we shall be concerned
with tensor functions, that is, tensors the components of which are
functions of $x_1, x_2, x_3, x_4$. We can speak of a tensor field. It is a
matter of importance that from a given tensor field other fields may
be derived by means of certain differential operators. For example,
let us assume that $\phi(x_1 x_2 x_3 x_4)$ is a scalar function defining a
tensor field of order 0. Then the increment of $\phi$ corresponding to the
increase of $x_1, \dots, x_4$ to $x_1 + dx_1, \dots, x_4 + dx_4$ is given by
$d\phi = (\partial\phi/\partial x_1)\,dx_1 + \cdots + (\partial\phi/\partial x_4)\,dx_4$.
Since its value does not depend on the coordinate system, $d\phi$ is
invariant. Furthermore $dx_1, \dots, dx_4$ transform like a four-vector and
therefore represent a contravariant tensor of the first order. From this
invariance of $d\phi$ it follows that
$\partial\phi/\partial x_1, \dots, \partial\phi/\partial x_4$ form a covariant
tensor. Thus, from a tensor field of order 0, a covariant tensor field of
order 1 can be derived by multiplying $\phi$ symbolically by
$\partial/\partial x_i$. The result evidently is the gradient of $\phi$.

The method applies also to tensor fields of higher order.
$a^i(x_1 x_2 x_3 x_4)$ may be supposed to represent a contravariant field of
the first order. With the help of an arbitrary tensor $a_i$ the components of
which have fixed values, we form the scalar function
$a_i a^i(x_1 \cdots x_4) = \phi$ the increment of which is
$d\phi = a_i(\partial a^i/\partial x_1)\,dx_1 + \cdots + a_i(\partial a^i/\partial x_4)\,dx_4$.
Since $d\phi$ is invariant,
$a_i(\partial a^i/\partial x_1), \dots, a_i(\partial a^i/\partial x_4)$ must be
the components of a covariant tensor of the first order and
$\partial a^i/\partial x_k$ the components of a mixed tensor $a_k^i$. By
contraction we obtain from $\partial a^i/\partial x_k$ the scalar function
$\partial a^i/\partial x_i$, which is identical with the divergence of the
tensor function.

The distinction between covariant and contravariant tensors forms
an essential feature of tensor calculus, since only by the combined
effect of these two can invariant quantities be established. A covariant
tensor can be coordinated to any contravariant tensor and vice versa,
the two tensors being, in general, different. However, a suitable
choice of the coordinates $x_1 x_2 x_3 x_4$ makes it possible to achieve
equal tensors. To do so we take as the fourth coordinate $x_4 = ict$, and then
the invariant expression $x_1^2 + x_2^2 + x_3^2 - c^2t^2$ changes to
$x_1^2 + x_2^2 + x_3^2 + x_4^2$, from which it follows that the contravariant
tensor $x_1 x_2 x_3 x_4$ is associated with a covariant tensor of the same
components. (On the other hand, $x_1, x_2, x_3, ct$ calls for
$x_1, x_2, x_3, -ct$.) We shall assume that $x_4$ is always $ict$; then we
need not distinguish between covariant and contravariant tensors, it being
sufficient to characterize tensors by lower indices.
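The effect of the choice $x_4 = ict$ can be illustrated numerically. The sketch below assumes a Lorentz transformation along the $x_1$ axis with an illustrative velocity and an arbitrary event, neither taken from the text. Written in the coordinates $x_1, \dots, x_4$, the transformation is a rotation through an imaginary angle, and the plain sum of squares $x_1^2 + x_2^2 + x_3^2 + x_4^2$ is unchanged.

```python
import math

c = 299_792_458.0  # speed of light in m/s

def boost(x1, x2, x3, x4, v):
    """Lorentz transformation along x1 acting on (x1, x2, x3, x4 = i*c*t)."""
    beta = v / c
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    # x1' = gamma*(x1 - v*t) and t' = gamma*(t - v*x1/c^2), rewritten with x4.
    x1p = gamma * (x1 + 1j * beta * x4)
    x4p = gamma * (x4 - 1j * beta * x1)
    return x1p, x2, x3, x4p

def sum_of_squares(x):
    """x1^2 + x2^2 + x3^2 + x4^2, with no complex conjugation."""
    return sum(component * component for component in x)

t = 1.0e-6                             # illustrative time in seconds
event = (100.0, 20.0, -5.0, 1j * c * t)
moved = boost(*event, v=0.6 * c)

invariant_before = sum_of_squares(event)
invariant_after = sum_of_squares(moved)
```

The first transformed coordinate stays real and the fourth stays purely imaginary, so the transformed event is again of the form $(x_1', x_2', x_3', ict')$.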

Above all, in quantum mechanics, we are interested in antisymmetric
tensors: they have the property that any component reverses sign
when the indices are interchanged; thus $a_{ik} = -a_{ki}$ and
$a_{ikl} = -a_{kil}$. In an antisymmetric tensor all components, such as
$a_{ii}$ or $a_{iik}$, having two equal indices must vanish, for these
components remain unchanged when the two indices are interchanged. Thus in
the four-dimensional world an antisymmetric tensor of the second order has
not sixteen but only twelve components. These form six pairs of components
having the same magnitude but opposite signs, and thus the tensor is defined
by the specification of six numbers. In an antisymmetric tensor of
the third order only those $a_{ikl}$ components having different $i, k, l$
are not equal to zero, so that the tensor may be specified by
four values which belong to $a_{234}, a_{314}, a_{124}, a_{123}$. These four
quantities transform like the components $a_1, a_2, a_3, a_4$ of a
four-vector, for the tensor $a_{ikl}$ together with the covariant tensor
(which, for $x_4 = ict$, is identical with the first one) defines the
invariant $6(a_{234}^2 + \cdots + a_{123}^2)$ which is the same kind as the
invariant $a_1^2 + \cdots + a_4^2$ of a tensor $a_1, \dots, a_4$. Thus an
antisymmetric tensor of the third order behaves like a four-vector. However,
there is no complete equivalence with a four-vector when the coordinate
system is changed so that all space axes are inverted but the time axis
remains the same. Then an ordinary four-vector changes the sign of every
space component, leaving the time component unchanged. On the other hand,
the vector $a_{234}, a_{314}, a_{124}, a_{123}$ behaves in just the opposite
manner, as, for example, $a_{234}$ transforms like the product
$x_2 x_3 x_4$. The antisymmetric tensor $a_{234}, a_{314}, a_{124}, a_{123}$
on this account is called a pseudovector.

Finally we consider an antisymmetric tensor $a_{iklm}$ of the fourth
order, which is evidently the highest order because more than four
indices cannot differ one from the other. Here we obtain a scalar, for
now the totality of the components reduces to one member,
$a_{1234} = -a_{2134} = \cdots$, and there follows from the invariance of
$a_{iklm}a_{iklm}$ that of $a_{1234}$. Thus an antisymmetric tensor of the
fourth order is a scalar which, like the pseudovector, is of a peculiar kind,
for, whereas a true scalar remains unchanged when a spatial reflection is
performed on the coordinate system, the quantity $a_{1234}$ reverses its
sign. For this reason it is called a pseudoscalar.
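The counting of components and the behavior under a spatial reflection can be verified by direct enumeration. The following is a minimal sketch with arbitrary illustrative component values. It builds antisymmetric tensors of orders 2, 3 and 4 in four dimensions from the values attached to increasing index sets, and applies a reflection by giving each index 1, 2 or 3 a factor $-1$, which is how a product such as $x_2 x_3 x_4$ behaves.

```python
from itertools import permutations, product

def parity(perm):
    """Return +1 or -1 according to the number of inversions in perm."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def antisymmetric(independent, order, dim=4):
    """All components of an antisymmetric tensor, given the values that
    belong to strictly increasing index tuples."""
    comps = {idx: 0.0 for idx in product(range(1, dim + 1), repeat=order)}
    for base, value in independent.items():
        for perm in permutations(base):
            comps[perm] = parity(perm) * value
    return comps

def reflect(comps):
    """Spatial reflection: every index 1, 2 or 3 contributes a factor -1."""
    return {idx: (-1) ** sum(1 for i in idx if i != 4) * value
            for idx, value in comps.items()}

second = antisymmetric({(1, 2): 1.0, (1, 3): 2.0, (1, 4): 3.0,
                        (2, 3): 4.0, (2, 4): 5.0, (3, 4): 6.0}, order=2)
third = antisymmetric({(2, 3, 4): 1.5, (1, 3, 4): -2.0,
                       (1, 2, 4): 0.5, (1, 2, 3): 3.0}, order=3)
fourth = antisymmetric({(1, 2, 3, 4): 2.0}, order=4)

nonzero_second = sum(1 for v in second.values() if v != 0.0)
sum_sq_third = sum(v * v for v in third.values())
third_reflected = reflect(third)
fourth_reflected = reflect(fourth)
```

With these values the second-order tensor has twelve non-vanishing components, the full sum of squares of the third-order tensor is six times the sum over its four independent values, the component with indices 2, 3, 4 is unchanged by the reflection while the one with indices 1, 2, 3 reverses, and the single fourth-order value reverses its sign.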

There is no difficulty in adapting the equations of the preceding
section to the case of a pseudoscalar field. When we substitute an
antisymmetric tensor function $\chi_{\alpha\beta\gamma\delta}$ for $\psi$,
instead of (247) we obtain

$$\frac{\partial\chi_{\alpha\beta\gamma\delta}}{\partial x_\alpha} = \kappa\,\psi_{\beta\gamma\delta} \qquad (254)$$

where $\psi_{\beta\gamma\delta}$ is a pseudovector. Equation (254) is an
invariant one since we have seen that from any tensor field another field of
an order reduced by one can be derived by differentiation and contraction.
In (254) the summation is over the index $\alpha$, but, as all terms of
$\chi_{\alpha\beta\gamma\delta}$ vanish except one, then (254) has precisely
the same form as (247), the only difference being that a pseudoscalar and
pseudovector replace scalar and vector.




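By differentiating $\psi_{\beta\gamma\delta}$, a tensor of the fourth order,
$\partial\psi_{\beta\gamma\delta}/\partial x_\alpha$, is obtained, which,
however, is not antisymmetric, for when $\alpha$ is interchanged with another
index, a quite different quantity results. We can, however, obtain an
antisymmetric tensor by combining four tensors of the kind
$\partial\psi_{\beta\gamma\delta}/\partial x_\alpha$ in a suitable way, and we
obtain the equation

$$\frac{\partial\psi_{\beta\gamma\delta}}{\partial x_\alpha} - \frac{\partial\psi_{\alpha\gamma\delta}}{\partial x_\beta} + \frac{\partial\psi_{\alpha\beta\delta}}{\partial x_\gamma} - \frac{\partial\psi_{\alpha\beta\gamma}}{\partial x_\delta} = \kappa\,\chi_{\alpha\beta\gamma\delta} \qquad (255)$$

Examination shows that the expression on the left-hand side is antisymmetric
in all indices; therefore it represents a scalar. Equation
(255) is the counterpart of equation (247').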

The rest of the theory is in perfect analogy to that which has been 
developed in the preceding section. Except for certain factors, all 
the previous expressions may be adopted. For example, for the 
Lagrangian function we obtain 


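$$\mathcal{L} = -\kappa\left(\frac{1}{3!}\,\psi_{\alpha\beta\gamma}^*\psi_{\alpha\beta\gamma} + \frac{1}{4!}\,\chi_{\alpha\beta\gamma\delta}^*\chi_{\alpha\beta\gamma\delta}\right) \qquad (256)$$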


The factors 1/3! and 1/4! must be inserted because, as in Section 47, 
we have to take the products only once, whereas our index convention 
requires summation over all the permutations of the indices. 

As long as the interaction of the field with the particles is omitted 
from consideration, both the scalar and pseudoscalar theories are 
perfectly equivalent, both describing the same field and differing only 
in the means of representation. This equivalence does not, however, 
apply to the interaction of the field with matter, a problem to be 
treated in Chapter 9. This interaction must be expressed by terms 
which are relativistically invariant, and an essential difference is 
introduced depending on whether the terms are derived from a scalar 
or a pseudoscalar field. 

49. Particles with Spin 1. de Broglie and Proca’s Equation. 
As has been pointed out, the spin requires a resolution of the
wave function $\psi$ into several components, for a scalar function is
incapable of describing the internal state of a particle. In Dirac’s
theory $\psi$ was resolved into two pairs of spinors. de Broglie, and later
Proca, tried to handle the spin in a different way by substituting a
four-vector $\psi_\alpha$ with components $\psi_1, \psi_2, \psi_3, \psi_4$. In
this way an antisymmetric tensor $\chi_{\alpha\beta}$ can be derived from
$\psi$ by means of the equations

$$\frac{\partial\psi_\beta}{\partial x_\alpha} - \frac{\partial\psi_\alpha}{\partial x_\beta} = \kappa\,\chi_{\alpha\beta} \qquad \kappa = \frac{m_0 c}{\hbar} \qquad (257)$$




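where $m_0$ is the rest mass of the particles of the field; they appear when
the field is quantized. These equations are in exact agreement with
Maxwell’s theory wherein, for a charge-free field, the field quantities
$\chi_{\alpha\beta}' = \mathbf{E}, \mathbf{H}$ are derived from a
four-potential ($\Phi_i = A_i$, $\Phi_4 = i\phi$) by
$\chi_{\alpha\beta}' = \partial\Phi_\beta/\partial x_\alpha - \partial\Phi_\alpha/\partial x_\beta$.
But, whereas the $\chi_{\alpha\beta}'$ quantities satisfy
the equations $\partial\chi_{\alpha\beta}'/\partial x_\alpha = 0$, it is now
assumed that

$$\frac{\partial\chi_{\alpha\beta}}{\partial x_\alpha} = \kappa\,\psi_\beta \qquad (258)$$

This equation is invariant because, from the operation on the left-hand
side, a four-vector results. Owing to the equality
$\chi_{\alpha\beta} = -\chi_{\beta\alpha}$, from (258) with $\kappa \neq 0$ we
obtain

$$\frac{\partial\psi_\beta}{\partial x_\beta} = \frac{1}{\kappa}\,\frac{\partial^2\chi_{\alpha\beta}}{\partial x_\alpha\,\partial x_\beta} = 0 \qquad (259)$$

On substituting in (258) for $\chi_{\alpha\beta}$ its value from (257), we
obtain

$$\frac{\partial^2\psi_\beta}{\partial x_\alpha^2} - \frac{\partial}{\partial x_\beta}\frac{\partial\psi_\alpha}{\partial x_\alpha} = \frac{\partial^2\psi_\beta}{\partial x_\alpha^2} = \kappa^2\psi_\beta$$

so that for each $\psi_\alpha$ the Klein-Gordon equation holds provided
$\kappa \neq 0$.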
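The relations (257) to (259) can be tried out on a plane wave. The sketch below assumes $\psi_\beta = \epsilon_\beta\,e^{ik_\alpha x_\alpha}$ with $k_4 = i\omega/c$, so that every derivative $\partial/\partial x_\alpha$ becomes a factor $ik_\alpha$. The wave vector, the amplitudes and the units are illustrative choices, not values from the text. It checks that (258) is satisfied when $k_\alpha k_\alpha = -\kappa^2$ and $k_\alpha\epsilon_\alpha = 0$, and that it fails when the condition (259) is violated.

```python
import math

c, kappa = 1.0, 0.8                 # illustrative units (assumption)
k_space = (0.3, -0.4, 1.2)          # illustrative spatial wave vector
omega = c * math.sqrt(sum(q * q for q in k_space) + kappa ** 2)
k = (*k_space, 1j * omega / c)      # k_4 = i*omega/c because x_4 = ict

def dot(u, v):
    """Four-dimensional sum u_a v_a, with no complex conjugation."""
    return sum(a * b for a, b in zip(u, v))

def residual(eps):
    """Largest |d(chi_ab)/dx_a - kappa*psi_b| for amplitudes eps, per unit wave."""
    # (257): kappa*chi_ab = d(psi_b)/dx_a - d(psi_a)/dx_b
    chi = [[1j * (k[a] * eps[b] - k[b] * eps[a]) / kappa for b in range(4)]
           for a in range(4)]
    # (258): d(chi_ab)/dx_a = kappa*psi_b
    return max(abs(sum(1j * k[a] * chi[a][b] for a in range(4)) - kappa * eps[b])
               for b in range(4))

eps_space = (1.0, 0.5, -0.2)
# Fourth amplitude chosen so that k_a eps_a = 0, which is condition (259).
eps_good = (*eps_space, -dot(k_space, eps_space) / k[3])
eps_bad = (*eps_space, 0.0)

res_good = residual(eps_good)
res_bad = residual(eps_bad)
```

Under these assumptions the residual vanishes to rounding error for the amplitudes that satisfy (259) and is clearly different from zero for the others.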

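If the $\psi_\alpha$ are complex quantities, $a_\alpha + ib_\alpha$, where
$a_\alpha$ and $b_\alpha$ are real, then analogously to the relation
$x_4 = ict$, we consider $\psi_4$ a quantity of the form $i(a_4 + ib_4)$ and
define $\psi^*$ as a four-vector with components $a_k - ib_k$
($k = 1, 2, 3$) and $i(a_4 - ib_4)$. If
$\partial\psi_\beta^*/\partial x_\alpha - \partial\psi_\alpha^*/\partial x_\beta$
is denoted by $\kappa\chi_{\alpha\beta}^*$, then (257), (258), and (259) hold
for $\psi_\alpha^*$ and $\chi_{\alpha\beta}^*$ also, so that, for example,

$$\frac{\partial\chi_{\alpha\beta}^*}{\partial x_\alpha} = \kappa\,\psi_\beta^* \qquad (258')$$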
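The passage to the case where an electromagnetic field is present is
accomplished as in Section 46 by substituting in (257) and (258)
$\partial/\partial x_\alpha + (ie/c\hbar)\Phi_\alpha$ for
$\partial/\partial x_\alpha$. Then instead of (258) we obtain

$$\left(\frac{\partial}{\partial x_\alpha} + \frac{ie}{c\hbar}\Phi_\alpha\right)\chi_{\alpha\beta} = \kappa\,\psi_\beta$$

with

$$\kappa\,\chi_{\alpha\beta} = \left(\frac{\partial}{\partial x_\alpha} + \frac{ie}{c\hbar}\Phi_\alpha\right)\psi_\beta - \left(\frac{\partial}{\partial x_\beta} + \frac{ie}{c\hbar}\Phi_\beta\right)\psi_\alpha \qquad (260)$$

Similarly for (258’) we obtain

$$\left(\frac{\partial}{\partial x_\alpha} - \frac{ie}{c\hbar}\Phi_\alpha\right)\chi_{\alpha\beta}^* = \kappa\,\psi_\beta^*$$

with

$$\kappa\,\chi_{\alpha\beta}^* = \left(\frac{\partial}{\partial x_\alpha} - \frac{ie}{c\hbar}\Phi_\alpha\right)\psi_\beta^* - \left(\frac{\partial}{\partial x_\beta} - \frac{ie}{c\hbar}\Phi_\beta\right)\psi_\alpha^*$$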
We try to find a four-current with real space components $s_1, s_2, s_3$ and
an imaginary time component $s_4$ which, because of (258) or (260), satisfy
the condition $\partial s_\alpha/\partial x_\alpha = 0$. The simplest
combination by which a four-vector is formed from the $\psi_\alpha$ and
$\chi_{\alpha\beta}$ quantities is given by the
expressions $\psi_\alpha\chi_{\alpha\beta}$. Therefore, on the assumption that
there is no electromagnetic field, we put

$$s_\beta = ia\left(\psi_\alpha\chi_{\alpha\beta}^* - \psi_\alpha^*\chi_{\alpha\beta}\right) \qquad (261)$$


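If $a$ is real, these expressions comply with the condition that
$s_1, s_2, s_3$ be real and $s_4$ imaginary. They also satisfy the continuity
equation, for we have

$$\frac{\partial s_\beta}{\partial x_\beta} = ia\left(\frac{\partial\psi_\alpha}{\partial x_\beta}\chi_{\alpha\beta}^* - \frac{\partial\psi_\alpha^*}{\partial x_\beta}\chi_{\alpha\beta} + \psi_\alpha\frac{\partial\chi_{\alpha\beta}^*}{\partial x_\beta} - \psi_\alpha^*\frac{\partial\chi_{\alpha\beta}}{\partial x_\beta}\right)$$

If the expressions (257) are substituted for $\chi_{\alpha\beta}$ and
$\chi_{\alpha\beta}^*$, the first two terms cancel each other, in one of them
the subscripts $\alpha$ and $\beta$ being interchanged. Because of (258), the
other two terms give zero. We obtain for the time component

$$s_4 = ic\rho = ia\left(\psi_\alpha\chi_{\alpha 4}^* - \psi_\alpha^*\chi_{\alpha 4}\right) \qquad (262)$$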
Thus $\rho$ depends on the $\psi_\alpha$ quantities and their first
derivatives. As the wave equation (258) for $\psi_\alpha$ is of second order,
the first derivatives may assume any values; hence both positive and negative
values are possible for $\rho$. Thus, again, the only interpretation of
$\rho$ is that of a charge density. When an electromagnetic field is present,
$\chi_{\alpha 4}$ and $\chi_{\alpha 4}^*$ are to be changed to


and 


re) 1e te) te 
eer. ween hee d, x J Tee ® Bea 
oe ch ) hy (<5 ch * ¥ 


respectively. 
In addition, we determine the energy and momentum densities,
confining ourselves, for the sake of simplicity, to the case $\Phi = 0$.
Here the field is described by (258), which, if $\psi_\alpha$ and
$\psi_\alpha^*$ are considered generalized coordinates of the field, can be
derived from the Lagrangian

$$\mathcal{L} = -\frac{\kappa}{2}\,\chi_{\alpha\beta}^*\chi_{\alpha\beta} - \kappa\,\psi_\alpha^*\psi_\alpha \qquad (263)$$




The variation dg of We gives 


| Fact eA oe dove 
= Xap 
OXa 


a* We 


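and this expression, when substituted in
$\delta\int dt\int\mathcal{L}\,dv = 0$, results in (258). For the momenta
$\pi_\alpha$ and $\pi_\alpha^*$ of $\psi_\alpha$ and $\psi_\alpha^*$, we
find, from (263),

$$\pi_\alpha = \frac{\partial\mathcal{L}}{\partial(\partial\psi_\alpha/\partial t)} = \frac{1}{ic}\,\frac{\partial\mathcal{L}}{\partial(\partial\psi_\alpha/\partial x_4)} = -\frac{1}{ic}\,\chi_{4\alpha}^*$$

$$\pi_\alpha^* = \frac{\partial\mathcal{L}}{\partial(\partial\psi_\alpha^*/\partial t)} = \frac{1}{ic}\,\frac{\partial\mathcal{L}}{\partial(\partial\psi_\alpha^*/\partial x_4)} = -\frac{1}{ic}\,\chi_{4\alpha} \qquad (264)$$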
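Since $H = \sum p\,(dq/dt) - L$, we obtain for the energy

$$H = -\int\left(\chi_{4\alpha}^*\frac{\partial\psi_\alpha}{\partial x_4} + \chi_{4\alpha}\frac{\partial\psi_\alpha^*}{\partial x_4} + \mathcal{L}\right) dv \qquad (265)$$

If, for $\partial\psi_\alpha/\partial x_4$, we substitute
$\kappa\chi_{4\alpha} + \partial\psi_4/\partial x_\alpha$, we get

$$\chi_{4\alpha}^*\frac{\partial\psi_\alpha}{\partial x_4} + \chi_{4\alpha}\frac{\partial\psi_\alpha^*}{\partial x_4} = 2\kappa\,\chi_{4\alpha}^*\chi_{4\alpha} + \chi_{4\alpha}^*\frac{\partial\psi_4}{\partial x_\alpha} + \chi_{4\alpha}\frac{\partial\psi_4^*}{\partial x_\alpha}$$

When the last two terms are integrated by parts, we obtain, because of
(258), $+\kappa\int 2\psi_4^*\psi_4\,dv$, and then (265) becomes

$$H = \int\left(-2\kappa\,\chi_{4\alpha}^*\chi_{4\alpha} - 2\kappa\,\psi_4^*\psi_4 + \frac{\kappa}{2}\,\chi_{\alpha\beta}^*\chi_{\alpha\beta} + \kappa\,\psi_\alpha^*\psi_\alpha\right) dv$$

$$= \int\left[-\kappa\,\chi_{4\alpha}^*\chi_{4\alpha} + \frac{\kappa}{2}\,\chi_{ik}^*\chi_{ik} - \kappa\left(\psi_4^*\psi_4 - \psi_i^*\psi_i\right)\right] dv$$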

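The summation is from 1 to 4 over the index $\alpha$ and from 1 to 3 over $k$
and $i$. Thus the energy density of the field is given by

$$\mathcal{H} = -\kappa\,\chi_{4\alpha}^*\chi_{4\alpha} + \frac{\kappa}{2}\,\chi_{ik}^*\chi_{ik} - \kappa\left(\psi_4^*\psi_4 - \psi_i^*\psi_i\right) \qquad (266)$$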


As $\psi_4$ and $\psi_4^*$ are defined by $i(a_4 + ib_4)$ and
$i(a_4 - ib_4)$, then $\psi_4^*\psi_4$ and $\chi_{4\alpha}^*\chi_{4\alpha}$
are negative. Therefore (266) is always positive.
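The sign argument can be checked on random numbers. The sketch below assumes that the components carrying one index 4 have the form $i(a + ib)$ with starred counterpart $i(a - ib)$, as stated for $\psi_4$, and that the same form holds for $\chi_{4\alpha}$. The random values and $\kappa = 1$ are illustrative choices. The sketch evaluates (266) for many such fields.

```python
import random

T = 3  # list position of the fourth (time-like) component

def star(z, timelike):
    """Starred quantity: the ordinary conjugate for space-like components;
    for a component i*(a + ib) carrying one index 4 it is i*(a - ib)."""
    return -z.conjugate() if timelike else z.conjugate()

def random_complex(rng):
    return complex(rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0))

def random_field(rng):
    """Random psi (4 components) and antisymmetric chi (4 x 4)."""
    psi = [random_complex(rng) for _ in range(3)] + [1j * random_complex(rng)]
    chi = [[0j] * 4 for _ in range(4)]
    for i in range(3):
        for k in range(i + 1, 3):
            chi[i][k] = random_complex(rng)
            chi[k][i] = -chi[i][k]
        chi[T][i] = 1j * random_complex(rng)
        chi[i][T] = -chi[T][i]
    return psi, chi

def energy_density(psi, chi, kappa=1.0):
    """Right-hand side of (266)."""
    time_part = sum(star(chi[T][a], True) * chi[T][a] for a in range(3))
    space_part = sum(star(chi[i][k], False) * chi[i][k]
                     for i in range(3) for k in range(3))
    psi_part = (star(psi[T], True) * psi[T]
                - sum(star(psi[i], False) * psi[i] for i in range(3)))
    total = -kappa * time_part + 0.5 * kappa * space_part - kappa * psi_part
    return total.real

rng = random.Random(0)
samples = [energy_density(*random_field(rng)) for _ in range(200)]

psi_example, chi_example = random_field(random.Random(1))
product_44 = star(psi_example[T], True) * psi_example[T]
```

Every sampled value of the energy density is positive, and the product $\psi_4^*\psi_4$ is real and negative, in agreement with the argument above.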

$\mathcal{H}$ can be developed to an energy-momentum tensor. To do this we
must form a symmetric tensor of the second order of $\psi_\alpha$ and
$\chi_{\alpha\beta}$ which agrees with $\mathcal{H}$ in the $T_{44}$ element.
This requirement is satisfied by the quantities

$$T_{mn} = -\kappa\left(\chi_{m\alpha}^*\chi_{n\alpha} + \chi_{n\alpha}^*\chi_{m\alpha}\right) - \kappa\left(\psi_m^*\psi_n + \psi_n^*\psi_m\right) - \mathcal{L}\,\delta_{mn} \qquad (267)$$

which represent a tensor, since $\psi_\alpha$ and $\chi_{\alpha\beta}$ form a
four-vector and a tensor respectively and $\mathcal{L}$, as is seen from
(263), is an invariant. Equation (267) satisfies the condition that
$T_{mn} = T_{nm}$, and in addition we have $T_{44} = \mathcal{H}$, as a
comparison with (266) shows.

The Proca equations describe particles with spin 1. This could be 
shown by a straightforward evaluation of the angular momentum. 
The proof, however, turns out to be rather troublesome. Much 
simpler is the method based on an investigation by Dirac, who suc- 
ceeded in developing the most general wave equation; it can be made 
specific for any value of the spin. It can be shown that, for spin 1, 
this equation agrees with the formalism proposed by Proca. 

50. The Pseudovector Field. We were able to represent particles 
with spin 0 by both a scalar and a pseudoscalar wave field. In a 
similar way, for particles with spin 1, we can use a pseudovector function
$\chi_{\alpha\beta\gamma}$ instead of the vector function. This function,
being defined as an antisymmetric tensor of the third order, can, as we have
seen, be specified by the four components $\chi_{234}$, $\chi_{314}$,
$\chi_{124}$, and $-\chi_{123}$ which, when the coordinate system is changed
for another, transform like the components of a four-vector, the only
difference being that a spatial reflection of the coordinate system does not
change the spatial components of the vector. A tensor of the second order can
be derived from a tensor of the third order by differentiation and
contraction, and therefore we obtain the counterpart of (257) by setting

$$\frac{\partial\chi_{\alpha\beta\gamma}}{\partial x_\alpha} = \kappa\,\psi_{\beta\gamma} = \kappa\,\psi_{\alpha\delta}' \qquad (268)$$


$\psi_{\beta\gamma}$ is an antisymmetric second-order tensor and therefore
represents a six-vector. $\psi_{\alpha\delta}'$ is the corresponding dual
six-vector the components of which are given by
$\psi_{\alpha\delta}' = \psi_{\beta\gamma}$ if $\alpha\delta\beta\gamma$ are
obtained from the sequence 1, 2, 3, 4 by an uneven number of permutations.
(This relationship between $\psi$ and $\psi'$ is not changed by a
transformation.) Equation (268) then corresponds exactly to (257) of the
preceding section.
For example, we have the relation 


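$$\frac{\partial\chi_{\alpha 24}}{\partial x_\alpha} = \frac{\partial\chi_{124}}{\partial x_1} + \frac{\partial\chi_{324}}{\partial x_3} = \kappa\,\psi_{24} = \kappa\,\psi_{13}'$$

which agrees with (257) since $\chi_{124}$ and $\chi_{234}$ are the
components $\chi_3'$ and $\chi_1'$ of the pseudovector
$\chi_{\alpha\beta\gamma}$. We obtain from $\psi_{\alpha\beta}$, by
differentiation, the tensor of the third order
$\partial\psi_{\alpha\beta}/\partial x_\gamma$, which is not antisymmetric
but can be developed to an antisymmetric tensor by adding
$\partial\psi_{\beta\gamma}/\partial x_\alpha + \partial\psi_{\gamma\alpha}/\partial x_\beta$.
The result is a pseudovector which must satisfy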




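the relation

$$\frac{\partial\psi_{\beta\gamma}}{\partial x_\alpha} + \frac{\partial\psi_{\gamma\alpha}}{\partial x_\beta} + \frac{\partial\psi_{\alpha\beta}}{\partial x_\gamma} = \kappa\,\chi_{\alpha\beta\gamma}$$

which corresponds to (258). For example, if we choose 2, 3, 4 for
$\alpha, \beta, \gamma$, we obtain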


O12 , dis" 4 Oia a 
0x2 0x3 0x4 


which agrees with (258). 
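The Lagrangian function $\mathcal{L}$ of a pseudovector field is found to be

$$\mathcal{L} = -\frac{\kappa}{2}\,\psi_{\alpha\beta}^*\psi_{\alpha\beta} - \frac{\kappa}{3!}\,\chi_{\alpha\beta\gamma}^*\chi_{\alpha\beta\gamma}$$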


We have pointed out that there is a physical difference between a 
scalar and a pseudoscalar field only if the interaction of the field with 
matter is to be considered. The same remark applies to the vector 
and pseudovector fields. 

Vector fields are subject to the same interpretation as scalar fields: 
they represent a pure wave motion that has nothing to do with par- 
ticles, and they describe only that aspect of reality which can be 
pictured by a wave field. The particles enter the picture only when 
the wave motion is quantized and, as a result of this operation, assume 
the property of a corpuscular radiation. There are, however, certain 
characteristic features of the particle picture which can be recognized 
in the wave picture. The fact that in the latter only a density which 
may be positive as well as negative can be defined can mean only 
that both positive and negative particles can exist in the particle 
picture, and it is the charge rather than the number of particles per
unit volume that we can apprehend. The explanation of this must
be that the number of particles has no definite value because of the 
possibility of pairs of positive and negative particles being created or 
annihilated. It is characteristic of relativistic quantum mechanics that,
in all its forms, it insists on processes of creation and annihilation of 
pairs. Only Dirac’s theory seemed at first to be an exception to this
because it permits particles of negative charge only. Nevertheless,
states of negative energy have to be considered, and the interpretation 
of these is possible only within the framework of a hole theory and 
leads to the assumption of processes presented to our observation as if 
pairs of particles were created or annihilated. 




PROBLEMS 


1. Solve the Dirac equation by the plane wave yi = uje“/™ (Zt-p®), Show that 
for a given p and £ there exist four solutions corresponding to E > 0, HE <0 and 
two directions of the spin. ae 

2. Show that the components $\sigma_1, \sigma_2, \sigma_3$ of the spin
$\sigma$ defined by (240) satisfy the equations

$$\sigma_i\sigma_k + \sigma_k\sigma_i = 2\delta_{ik}$$


3. Prove the same for the matrices

$$\rho_1 = i\alpha_3\alpha_2\alpha_1 \qquad \rho_2 = \beta\alpha_3\alpha_2\alpha_1 \qquad \rho_3 = \beta$$


Show that these matrices commute with the o;. 
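4. Show that $\sigma_2\sigma_3 - \sigma_3\sigma_2 = 2i\sigma_1$, etc.
5. If $\mathbf{a}$ and $\mathbf{b}$ are two arbitrary vectors, it can be
shown that

$$(\sigma\cdot\mathbf{a})(\sigma\cdot\mathbf{b}) = (\mathbf{a}\cdot\mathbf{b}) + i\,(\sigma\cdot\mathbf{a}\times\mathbf{b})$$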


6. By using the matrices p1, p2, p3 of Problem 3, show that equation (245) can 
be transformed into 


{wee +m[e-(p-£a)| +oimel y <0 


7. On the left-hand side of the above equation apply the operator 


1 E — ed) - | +-(>-24)| — p3me 
c c 


and show, by using theorem 5 (formulated in Problem 5), that the result is 


2 2 
(2=*) -(p-£a) ~ niet +-m(#+-cut a) 
c c me 
valerate] o- 
2mc 


Interpret the last two terms as being due to an electric and a magnetic dipole. 
8. Show that the Proca equation can be derived from
$(\Box - \kappa^2)\psi_i = 0$, together with (256) and the condition
$\sum_i(\partial\psi_i/\partial x_i) = 0$.


9. Discuss the difference in meaning between a non-relativistic and a relativistic 
wave equation. 


7
QUANTIZATION 
OF WAVE FIELDS 


51. The Idea of Quantization. Our next task is to bring the 
theory of wave fields into contact with the concepts of quantum 
mechanics in order to find a way to introduce the particle idea. This 
can be achieved by interpreting the field quantities, which have been 
represented by certain tensors or spinors in the preceding chapter, as 
matrices which must satisfy certain commutation relations. In this 
way a field is given properties that can be expressed in terms of whole 
numbers, thus providing an explanation for the fact that a measure- 
ment, for example, of the charge always furnishes an integer multiple 
of a fundamental unit, so that the field is behaving like an assemblage 
of particles. 

This quantization can be brought about by requiring the wave
functions $\psi_i$ and the corresponding momenta $\pi_i$, which are taken as
matrices, to fulfill the commutation relations of (136). In the following,
however, we prefer to transform $\psi_i$ and $\pi_i$ into a denumerable set
of coordinates $q_k$ and momenta $p_k$. For this purpose we first determine
the Lagrangian function $L$ from which the given wave equation
can be derived by means of $\delta\int L\,dt = 0$. The next step is to expand
the wave function in terms of an arbitrary set of orthogonal functions,
$f_k(xyz)$:

$$\psi(xyzt) = \sum_k q_k(t)\,f_k(xyz)$$

the consequence being that the wave equation becomes an infinite
system of equations for the $q_k(t)$ terms. $L$ then becomes a function of
the $q_k$ terms, which are the generalized coordinates of the field, and of
the derivatives $dq_k/dt$, that is, $L = L[q_k, (dq_k/dt)]$. From this
function the momenta $p_k$, which are conjugate to the coordinates $q_k$, can
be determined by means of the relation
$p_k = \partial L/\partial(dq_k/dt)$. Then for
the Hamiltonian of the system we obtain

$$H = \sum_k \frac{dq_k}{dt}\,p_k - L\left(q_k, \frac{dq_k}{dt}\right) = H(q_k p_k)$$

The wave equation can now be transcribed into

$$\frac{dq_k}{dt} = \frac{\partial H}{\partial p_k} \qquad \frac{dp_k}{dt} = -\frac{\partial H}{\partial q_k}$$

In order to quantize the system the quantities $q_k$ and $p_k$ must be
replaced by matrices $Q_k$ and $P_k$, which satisfy certain commutation
relations. Care must be taken in setting down these relations.
According to quantum mechanics, (136) certainly holds for real conjugate
observables, which are represented by Hermitean matrices.
But in the theory of wave fields we are frequently concerned with
$q_k$ and $p_k$ terms which are complex, and hence we must investigate
first whether we can apply (136) to this case as well. Let us assume
that $\psi$ is a real or complex wave function and that $\pi$, the conjugate
momentum function, derives from the Lagrangian $L = \int dv\,\mathcal{L}$ by
differentiation of $\mathcal{L}$ relative to $\partial\psi/\partial t$, that
is, $\pi = \partial\mathcal{L}/\partial(\partial\psi/\partial t)$. We
imagine that the field is enclosed in a cubic space of extension $l$, and
assume that the field periodically extends beyond this space. Then
$\psi$ and $\pi$ undergo a Fourier expansion if we choose the real functions
$f_i = \sqrt{2/l^3}\,\{\sin, \cos\}\,\mathbf{k}_i\cdot\mathbf{r}$ as an
orthogonal system. $\mathbf{k}_i$ denotes a vector the components of which
are $2\pi/l$ times a whole number $n_1, n_2, n_3$, so that
$\mathbf{k}_i\cdot\mathbf{r} = (2\pi/l)(n_1 x + n_2 y + n_3 z)$. The physical
meaning of the expansions

$$\psi = \sum_i q_i f_i \qquad \pi = \sum_i p_i f_i \qquad (269)$$

is that of a resolution of $\psi$ and $\pi$ into stationary waves that have
the directions of the $\mathbf{k}_i$ vectors and a wavelength given by
$|\mathbf{k}_i| = 2\pi/\lambda_i$.
Two opposite directions are taken as one, so that only the directions
of a hemisphere need be considered; therefore, for two of the $n_j$,
we must take both the positive and negative whole numbers, but for
the third only the positive numbers. The $f_i$ functions form a complete
orthogonal system by satisfying the condition
$\int dv\,f_i f_k = \delta_{ik}$. The
$q_i$ and $p_i$ terms are real or complex depending on whether $\psi$ is real
or complex. If complex, we pass from the complex $q_i$ and $p_i$ terms to
the real coordinates $q_i^{(1)}, q_i^{(2)}$ and real momenta
$p_i^{(1)}, p_i^{(2)}$ by means of
the transformation


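$$q_i = \frac{1}{\sqrt2}\left(q_i^{(1)} + i\,q_i^{(2)}\right) \qquad q_i^* = \frac{1}{\sqrt2}\left(q_i^{(1)} - i\,q_i^{(2)}\right)$$

$$p_i = \frac{1}{\sqrt2}\left(p_i^{(1)} - i\,p_i^{(2)}\right) \qquad p_i^* = \frac{1}{\sqrt2}\left(p_i^{(1)} + i\,p_i^{(2)}\right) \qquad (270)$$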
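The role of the factor $1/\sqrt2$ in (270) can be seen from a short numerical check. The sketch below uses arbitrary illustrative numbers for one mode and verifies that the sum $p\,\dot q + p^*\dot q^*$ formed with the complex quantities equals $p^{(1)}\dot q^{(1)} + p^{(2)}\dot q^{(2)}$ formed with the real ones, which is what is needed for the real pairs to be conjugate as well.

```python
import math

def to_complex(q1, q2, p1, p2):
    """Transformation (270): real pairs to complex coordinate q and momentum p."""
    s = 1.0 / math.sqrt(2.0)
    return s * complex(q1, q2), s * complex(p1, -p2)

def to_real(q, p):
    """Inverse of (270): recover q1, q2, p1, p2 from complex q and p."""
    s = math.sqrt(2.0)
    return s * q.real, s * q.imag, s * p.real, -s * p.imag

# Illustrative velocities and momenta of one mode (assumed values).
q1_dot, q2_dot, p1, p2 = 0.3, -1.2, 2.0, 0.7

# The velocities transform like the coordinates themselves.
q_dot, p = to_complex(q1_dot, q2_dot, p1, p2)

complex_sum = (p * q_dot + p.conjugate() * q_dot.conjugate()).real
real_sum = p1 * q1_dot + p2 * q2_dot
round_trip = to_real(q_dot, p)
```

Without the factor $1/\sqrt2$ the first sum would come out twice as large as the second.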




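The factor $1/\sqrt2$ is necessary in order that $q_i^{(1)}, p_i^{(1)}$ and
$q_i^{(2)}, p_i^{(2)}$ become conjugate together with $q_i, p_i$. If
$H(q, p, q^*, p^*)$ is the Hamiltonian of the field (always assumed real),
then, according to (270) we obtain

$$\frac{\partial H}{\partial p_i^{(1)}} = \frac{\partial H}{\partial p_i}\frac{\partial p_i}{\partial p_i^{(1)}} + \frac{\partial H}{\partial p_i^*}\frac{\partial p_i^*}{\partial p_i^{(1)}} = \frac{1}{\sqrt2}\left(\frac{\partial H}{\partial p_i} + \frac{\partial H}{\partial p_i^*}\right) = \frac{1}{\sqrt2}\left(\frac{dq_i}{dt} + \frac{dq_i^*}{dt}\right) = \frac{dq_i^{(1)}}{dt}$$

so that the $q_i^{(1)}, q_i^{(2)}$ and $p_i^{(1)}, p_i^{(2)}$ also satisfy the
canonical equations. The matrices
$Q_i^{(1)}, P_i^{(1)}, Q_i^{(2)}, P_i^{(2)}$ which belong to
$q_i^{(1)}, p_i^{(1)}$ and $q_i^{(2)}, p_i^{(2)}$
respectively must then fulfill the equations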


(Q/P "| = ” binds (0/Qe"}=0  [PePy’] =0 (271) 


(where such an expression as $[ab]$ signifies $ab - ba$). From this it
follows that, for the complex matrices belonging to $q_i$ and $p_i$,


(Q:Px] = 51a” + iQ;, Py? — iP, ] 
— 5 (0% (Dp had += 5 (@ {Ppp ] te * 8 in 


(Q0:.)=0 [PP] =0 (272) 
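Thus the expansion (269) in terms of the real orthogonal functions

$$\sqrt{\frac{2}{l^3}}\,\begin{Bmatrix}\sin\\ \cos\end{Bmatrix}\,\mathbf{k}_i\cdot\mathbf{r}$$

leads to coordinates $q_i$ and momenta $p_i$, for which (272)
holds regardless of whether $\psi$ is real or complex.

Frequently, however, it is convenient to expand $\psi$ in terms of the
complex functions $(1/\sqrt{l^3})\,e^{i\mathbf{k}_i\cdot\mathbf{r}}$. For this purpose we substitute
$(1/2i)(e^{i\mathbf{k}_i\cdot\mathbf{r}} - e^{-i\mathbf{k}_i\cdot\mathbf{r}})$ for $\sin\mathbf{k}_i\cdot\mathbf{r}$ and $\tfrac{1}{2}(e^{i\mathbf{k}_i\cdot\mathbf{r}} + e^{-i\mathbf{k}_i\cdot\mathbf{r}})$ for $\cos\mathbf{k}_i\cdot\mathbf{r}$,
the effect being a decomposition of the stationary waves into running
waves in the directions $\mathbf{k}_i$ and $-\mathbf{k}_i$. Instead of (269) we will have
then

$$\psi = \frac{1}{\sqrt{l^3}}\sum_i q_i\,e^{i\mathbf{k}_i\cdot\mathbf{r}} \qquad \pi = \frac{1}{\sqrt{l^3}}\sum_i p_i\,e^{-i\mathbf{k}_i\cdot\mathbf{r}} \tag{273}$$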


The summation is now to be extended over all directions $i$, not only
over those of a hemisphere. Thus the range of the index $i$ is from
$-\infty$ to $+\infty$, the two indices $+i$ and $-i$ signifying two opposite
directions. It is convenient, in the expansion of $\pi$, to coordinate $p_i$
to $e^{-i\mathbf{k}_i\cdot\mathbf{r}}$. The functions $u_i = (1/\sqrt{l^3})\,e^{i\mathbf{k}_i\cdot\mathbf{r}}$, like the $f_i$, form a normalized
orthogonal system, since

$$\int u_i^*u_k\,dv = \delta_{ik} \tag{274}$$

For a real $\psi$, for any value $i$ we have

$$q_i = q_{-i}^* \qquad p_i = p_{-i}^*$$

so that the two terms $q_{-i}e^{-i\mathbf{k}_i\cdot\mathbf{r}} + q_ie^{i\mathbf{k}_i\cdot\mathbf{r}}$ provide a real value.
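
The orthonormality condition (274) can be checked numerically. The following is only a minimal sketch and not part of the original text: the function names `overlap_1d` and `overlap`, the sample count, and the use of a plain Riemann sum are illustrative assumptions. Because $u_i^*u_k$ factorizes into three one-dimensional exponentials, the volume integral is computed as a product of three one-dimensional sums.

```python
import cmath
import math


def overlap_1d(n, m, box_length=1.0, samples=64):
    """Riemann sum of (1/l) * exp(-i k_n x) * exp(i k_m x) over one period.

    k_n = 2*pi*n/l, so the integrand is exp(2*pi*i*(m - n)*x/l) / l.
    """
    dx = box_length / samples
    total = 0j
    for s in range(samples):
        x = s * dx
        phase = 2.0 * math.pi * (m - n) * x / box_length
        total += cmath.exp(1j * phase) * dx / box_length
    return total


def overlap(n_vec, m_vec, box_length=1.0):
    """Integral of u_n^* u_m over the cube, as a product of 1D integrals."""
    result = 1.0 + 0j
    for n, m in zip(n_vec, m_vec):
        result *= overlap_1d(n, m, box_length)
    return result


if __name__ == "__main__":
    print(abs(overlap((1, 0, -2), (1, 0, -2))))  # same mode
    print(abs(overlap((1, 0, -2), (1, 1, -2))))  # different modes
```

For equal index triples the result is close to unity, and for different triples it is close to zero, as long as the index differences stay below the number of samples.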

In order to set up the commutation relations for the matrices
$Q_i$, $P_i$ of $q_i$ and $p_i$, we form the commutator $[\psi\pi]$ and integrate over the
field space. If we designate the matrices considered by $Q_i'$ and $P_i'$ in
order to distinguish them from $Q_i$ and $P_i$, we have

$$\int[\psi\pi]\,dv = \int\Big[\sum_i Q_i'u_i,\ \sum_k P_k'u_k^*\Big]dv = \int\Big[\sum_i Q_if_i,\ \sum_k P_kf_k\Big]dv$$

Hence, if we take into account (274), it follows that

$$\sum_i[Q_i'P_i'] = \sum_i[Q_iP_i] \tag{275}$$


The number of terms on the left-hand and right-hand sides is the same;
for, although we sum on the right only over the directions of the
hemisphere, every direction is to be taken twice, the function $f_i$ appearing
as sine and cosine. Therefore (275) requires that, since $[Q_iP_i] = (\hbar/i)E$,
then $[Q_i'P_i']$ must also equal $(\hbar/i)E$. Correspondingly,
from the fact that $[\psi\psi] = [\pi\pi] = 0$, it follows that $[Q_i'Q_{-i}']$ and $[P_i'P_{-i}']$
must vanish. Hence the matrices $Q_i'$ and $P_i'$ may, in general, be
assumed to satisfy the relations

$$[Q_i'P_k'] = \frac{\hbar}{i}\,\delta_{ik}E \qquad [Q_i'Q_k'] = 0 \qquad [P_i'P_k'] = 0 \tag{276}$$


52. Quantization of a Scalar Field. The procedure will be 
applied first to a scalar field. (The field may be pseudoscalar as well, 
but then a pseudoscalar and pseudovector must replace the scalar and 
the vector.) In this case, according to (250), the Lagrangian is given 
by 


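$$L = -\kappa\int(\psi^*\psi + \chi_\alpha^*\chi_\alpha)\,dv \tag{277}$$

[Note: $\chi_\alpha^*$ signifies $(1/\kappa)(\partial\psi^*/\partial x_\alpha)$ and not $\big((1/\kappa)(\partial\psi/\partial x_\alpha)\big)^*$.] In order
to transform $L$ into a function of the coordinates $q_i$ and their derivatives
$dq_i/dt$, we introduce the momentum field $\pi$, conjugate to $\psi$:

$$\pi = \frac{\partial L}{\partial(\partial\psi/\partial t)} = \frac{1}{\kappa c^2}\frac{\partial\psi^*}{\partial t} \qquad \pi^* = \frac{1}{\kappa c^2}\frac{\partial\psi}{\partial t}$$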
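For the Hamiltonian of the field we then obtain

$$H = \int\left(\pi\frac{\partial\psi}{\partial t} + \pi^*\frac{\partial\psi^*}{\partial t}\right)dv - L = \int\left(\kappa c^2\pi^*\pi + \kappa\psi^*\psi + \frac{1}{\kappa}\frac{\partial\psi^*}{\partial x_i}\frac{\partial\psi}{\partial x_i}\right)dv$$

in which the summation of $i$ is from 1 to 3 only. If we expand $\psi$ and
$\pi$ in terms of running waves, $H$ becomes

$$H = \sum_i\left[\kappa c^2p_i^*p_i + \left(\kappa + \frac{k_i^2}{\kappa}\right)q_i^*q_i\right] \tag{278}$$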


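The canonical equations arising from $H$,

$$\frac{dq_i}{dt} = \frac{\partial H}{\partial p_i} = \kappa c^2p_i^* \qquad \frac{dp_i}{dt} = -\frac{\partial H}{\partial q_i} = -\left(\kappa + \frac{k_i^2}{\kappa}\right)q_i^*$$

are identical with the field equations, for they lead to the equations

$$\frac{d^2q_i}{dt^2} = -\kappa c^2\left(\kappa + \frac{k_i^2}{\kappa}\right)q_i$$

from which, on returning to $\psi$ with the help of (273), we arrive again
at the equation

$$\nabla^2\psi - \frac{1}{c^2}\frac{\partial^2\psi}{\partial t^2} = \kappa^2\psi$$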
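The momentum $\mathbf{G}$ and the charge $\varepsilon$ can be expressed in terms of
$q_i$ and $p_i$ just as was the energy $H$. If, for the sake of simplicity, we
assume $\phi = \mathbf{A} = 0$, then, according to (253) and (249'), the momentum
and charge density are given by

$$g_\alpha = \frac{1}{\kappa c^2}\left(\frac{\partial\psi^*}{\partial t}\frac{\partial\psi}{\partial x_\alpha} + \frac{\partial\psi}{\partial t}\frac{\partial\psi^*}{\partial x_\alpha}\right)$$

$$\rho = \frac{ia}{\kappa c^2}\left(\psi\frac{\partial\psi^*}{\partial t} - \psi^*\frac{\partial\psi}{\partial t}\right)$$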


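As we see, $\rho = 0$ for a field with a real $\psi$. By integration, we find for
$\mathbf{G}$ and $\varepsilon$

$$\mathbf{G} = i\sum_i\mathbf{k}_i\,(q_ip_i - q_i^*p_i^*) \qquad \varepsilon = ia\sum_i(q_ip_i - q_i^*p_i^*)$$

We now quantize the field by transcribing $q_i$ and $p_i$ into the matrices
$Q_i$ and $P_i$, which satisfy the relations of (276). Since $q_i + q_i^*$ is
real, the corresponding matrix must be Hermitean. To satisfy this
condition we have to translate $q_i^*$ into $Q_i^\dagger$, the Hermitean conjugate of $Q_i$. We then obtain for $H$,
$\mathbf{G}$, and $\varepsilon$ the matrices

$$H = \sum_i\left[\kappa c^2P_i^\dagger P_i + \left(\kappa + \frac{k_i^2}{\kappa}\right)Q_i^\dagger Q_i\right]$$

$$\mathbf{G} = -i\sum_i\mathbf{k}_i\,(Q_i^\dagger P_i^\dagger - Q_iP_i) \tag{279}$$

$$\varepsilon = -ia\sum_i(Q_i^\dagger P_i^\dagger - Q_iP_i)$$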


We shall show now that there are matrices, $Q_i$ and $P_i$, satisfying the
required commutation relations with which, if the coordinate system
$K$ of the Hilbert space is suitably chosen, the matrices $H$, $\mathbf{G}$ and $\varepsilon$
become diagonal simultaneously. For this purpose we introduce
new matrices, $A_i$ and $B_i$, which are defined by

$$Q_i = -i\sqrt{\frac{c\hbar}{2}}\;\frac{1}{\sqrt[4]{1 + k_i^2/\kappa^2}}\;(A_i - B_i^\dagger) \qquad P_i = \sqrt{\frac{\hbar}{2c}}\;\sqrt[4]{1 + \frac{k_i^2}{\kappa^2}}\;(A_i^\dagger + B_i) \tag{280}$$

It is proved easily that all the requirements of (276) are fulfilled if
the matrices $A_i$ and $B_i$ are chosen in such a way that

$$[A_iA_k^\dagger] = \delta_{ik}E \qquad [B_iB_k^\dagger] = \delta_{ik}E \tag{281}$$

with all the other bracket terms such as $[A_iA_k]$, $[A_iB_k]$, ... being
equal to zero. Then we have, for example,

$$[Q_iP_k] = \frac{\hbar}{2i}\Big([A_iA_k^\dagger] - [B_i^\dagger B_k] + [A_iB_k] - [B_i^\dagger A_k^\dagger]\Big) = \frac{\hbar}{i}\,\delta_{ik}E$$

It can be shown in the same manner that the other relations are
satisfied.

[Note on (279): the succession of the factors in the formulas is not uniquely
determined by the condition that $H$, $\mathbf{G}$, and $\varepsilon$ be Hermitean.]

The relations expressed by (281) are exactly the same as those we
encountered in Section 43 for the treatment of the many-body problem,
the only difference being that here the relations involve two independent
sequences of matrices, $A_i, A_i^\dagger$ and $B_i, B_i^\dagger$. This independence is
made evident by the commutability of the pairs. As in Section 43,
we now choose the coordinate system $K$ of the Hilbert space relative
to which all the products $A_i^\dagger A_i$ and $B_i^\dagger B_i$ are diagonal. Then any
axis of $K$ can be marked by two sequences of numbers, $n_1^+n_2^+n_3^+\cdots$
and $n_1^-n_2^-n_3^-\cdots$, signifying that the axis belongs to the $n_1^+$-th
eigenvalue of $A_1^\dagger A_1$, the $n_2^+$-th eigenvalue of $A_2^\dagger A_2$, and so on, and
similarly to the $n_1^-$-th eigenvalue of $B_1^\dagger B_1$, and so on. Then, according
to Section 43, the eigenvalues of $A_i^\dagger A_i$ and $B_i^\dagger B_i$ are given by

$$(A_i^\dagger A_i)_{nn} = n_i^+ \qquad (B_i^\dagger B_i)_{nn} = n_i^-$$

where $n$ stands for the complete label $n_1^+n_2^+\cdots n_1^-n_2^-\cdots$ of the axis,
whereas all non-diagonal elements vanish.
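If now, in the expressions (279), $A_i$ and $B_i$ are substituted for $Q_i$ and
$P_i$, we obtain

$$H = \sum_i\frac{\kappa\hbar c}{2}\sqrt{1 + \frac{k_i^2}{\kappa^2}}\;\Big[(A_i + B_i^\dagger)(A_i^\dagger + B_i) + (A_i^\dagger - B_i)(A_i - B_i^\dagger)\Big]$$

$$= \frac{1}{2}\sum_i\hbar c\sqrt{\kappa^2 + k_i^2}\;\big(A_iA_i^\dagger + A_iB_i + B_i^\dagger A_i^\dagger + B_i^\dagger B_i + A_i^\dagger A_i - B_iA_i - A_i^\dagger B_i^\dagger + B_iB_i^\dagger\big)$$

Since, because of (281), $A_iA_i^\dagger = E + A_i^\dagger A_i$, the expression in parentheses
reduces to $2A_i^\dagger A_i + 2B_i^\dagger B_i + 2E$, and with $\kappa = m_0c/\hbar$, we obtain

$$H = \sum_i c\sqrt{m_0^2c^2 + k_i^2\hbar^2}\;\big(A_i^\dagger A_i + B_i^\dagger B_i + E\big) \tag{282}$$

Now, according to (228), $c\sqrt{m_0^2c^2 + k_i^2\hbar^2}$ is the energy $E_i$ of a
particle with rest mass $m_0$ and momentum $k_i\hbar$. Therefore (282)
represents $H$ as a matrix diagonal in $K$ the eigenvalues of which,
belonging to the axes $n_1^+n_2^+\cdots n_1^-n_2^-\cdots$, are given by

$$\sum_i E_i\,(n_i^+ + n_i^- + 1)$$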




The physical interpretation of this is that, when we measure the energy
of the field (we make the vector representing the state take the direction
of an axis $n_1^+n_2^+\cdots n_1^-n_2^-\cdots$), the result is the same as
if the system were composed of particles which are capable only of
energies $E_i = c\sqrt{m_0^2c^2 + k_i^2\hbar^2}$. The axis $n_1^+n_2^+\cdots n_1^-n_2^-\cdots$
represents a state in which $n_1^+ + n_1^-$ particles are observed to have
an energy $E_1$, $n_2^+ + n_2^-$ to have an energy $E_2$, and so on.

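A corresponding interpretation is possible for the momentum $\mathbf{G}$ and
the charge $\varepsilon$. We find

$$\mathbf{G} = \frac{1}{2}\sum_i\mathbf{k}_i\hbar\,\Big[(A_i - B_i^\dagger)(A_i^\dagger + B_i) + (A_i^\dagger - B_i)(A_i + B_i^\dagger)\Big] = \sum_i\mathbf{k}_i\hbar\,\big(A_i^\dagger A_i - B_i^\dagger B_i\big)$$

$$\varepsilon = a\sum_i\hbar\,\big(A_i^\dagger A_i - B_i^\dagger B_i\big) \tag{283}$$

with $(A_i^\dagger A_i - B_i^\dagger B_i)_{nn} = n_i^+ - n_i^-$.
This means that in the state $n_1^+n_2^+\cdots n_1^-n_2^-\cdots$ the system
has a total momentum $\sum_i\mathbf{k}_i\hbar\,(n_i^+ - n_i^-)$ and a charge $\varepsilon = \sum_i e\,(n_i^+ - n_i^-)$.
Thus the numbers that originate in the quantization permit
the interpretation that $n_i^+$ is the number of particles having the charge
$+e$, momentum $+\mathbf{k}_i\hbar$, and energy $c\sqrt{m_0^2c^2 + k_i^2\hbar^2}$, whereas $n_i^-$
counts the particles with charge $-e$, momentum $-\mathbf{k}_i\hbar$, and energy
$c\sqrt{m_0^2c^2 + k_i^2\hbar^2}$. Particles of negative energy do not occur in the
theory because the Hamiltonian can have positive values only.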
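
The particle interpretation just described amounts to simple bookkeeping over occupation numbers. The following is only a sketch under stated assumptions: the helper names `wave_vector`, `mode_energy`, and `field_observables` are hypothetical, SI values are used for $\hbar$ and $c$, and the zero-point term "+1" is summed only over the modes actually listed, since the full sum over all modes diverges.

```python
import math

HBAR = 1.054571817e-34   # J s (assumed SI units for this sketch)
C = 2.99792458e8         # m / s


def wave_vector(n_vec, box_length):
    """k_i = (2*pi/l) * (n1, n2, n3)."""
    return tuple(2.0 * math.pi * n / box_length for n in n_vec)


def mode_energy(n_vec, rest_mass, box_length):
    """E_i = c * sqrt(m0^2 c^2 + k_i^2 hbar^2)."""
    k_sq = sum(comp * comp for comp in wave_vector(n_vec, box_length))
    return C * math.sqrt((rest_mass * C) ** 2 + HBAR ** 2 * k_sq)


def field_observables(occupations, rest_mass, box_length, unit_charge):
    """Energy, momentum and charge of a state labelled by occupation numbers.

    occupations maps (n1, n2, n3) -> (n_plus, n_minus).
    """
    energy = 0.0
    momentum = [0.0, 0.0, 0.0]
    charge = 0.0
    for n_vec, (n_plus, n_minus) in occupations.items():
        energy += mode_energy(n_vec, rest_mass, box_length) * (n_plus + n_minus + 1)
        k = wave_vector(n_vec, box_length)
        for axis in range(3):
            momentum[axis] += HBAR * k[axis] * (n_plus - n_minus)
        charge += unit_charge * (n_plus - n_minus)
    return energy, tuple(momentum), charge
```

A state with equal numbers of positive and negative particles in every mode has zero total charge and zero total momentum but a nonvanishing energy.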


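When the field is real, we have $L = -\kappa\int(\psi^2 + \chi_\alpha\chi_\alpha)\,dv$, hence $\pi = (2/\kappa c^2)(\partial\psi/\partial t)$. Formulas (279) must, therefore, be changed into

$$H = \sum_i\left[\tfrac{1}{4}\kappa c^2P_i^\dagger P_i + \left(\kappa + \frac{k_i^2}{\kappa}\right)Q_i^\dagger Q_i\right]$$

$$\mathbf{G} = -\frac{i}{2}\sum_i\mathbf{k}_i\,(Q_i^\dagger P_i^\dagger - Q_iP_i)$$

$$\varepsilon = -\frac{ia}{2}\sum_i(Q_i^\dagger P_i^\dagger - Q_iP_i)$$

In this case, $Q_i$, as defined in (280), is multiplied by $1/\sqrt{2}$, and $P_i$ in
the same expression by $\sqrt{2}$. Then we obtain

$$H = \sum_i\tfrac{1}{2}c\sqrt{m_0^2c^2 + k_i^2\hbar^2}\;\big(A_i^\dagger A_i + B_i^\dagger B_i + E\big)$$

$$\mathbf{G} = \sum_i\tfrac{1}{2}\mathbf{k}_i\hbar\,\big(A_i^\dagger A_i - B_i^\dagger B_i\big)$$

$$\varepsilon = \tfrac{1}{2}a\hbar\sum_i\big(A_i^\dagger A_i - B_i^\dagger B_i\big)$$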


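For a real field we have $q_{-i} = q_i^*$, $p_{-i} = p_i^*$, and therefore $Q_{-i} = Q_i^\dagger$,
$P_{-i} = P_i^\dagger$. In order to satisfy this condition we have to set

$$A_{-i} = B_i \quad\text{and}\quad B_{-i} = A_i \tag{284}$$

In the expression for $H$ the terms arising from $i$ and $-i$ then give

$$\tfrac{1}{2}c\sqrt{m_0^2c^2 + k_i^2\hbar^2}\;\big(A_i^\dagger A_i + B_i^\dagger B_i + A_{-i}^\dagger A_{-i} + B_{-i}^\dagger B_{-i}\big) = c\sqrt{m_0^2c^2 + k_i^2\hbar^2}\;\big(A_i^\dagger A_i + A_{-i}^\dagger A_{-i}\big)$$

so that we get

$$H = \sum_i c\sqrt{m_0^2c^2 + k_i^2\hbar^2}\left(A_i^\dagger A_i + \frac{E}{2}\right)$$

$$\mathbf{G} = \sum_i\mathbf{k}_i\hbar\,A_i^\dagger A_i$$

$$\varepsilon = 0$$

Thus the matrix $B$ may be dropped in the case of a real field, and the
coordinate system $K$ of the Hilbert space may be reduced to a system
the axes of which are marked by one sequence $n_1n_2\cdots$ of numbers
only. When the vector representing the state has the direction of an
axis $n_1n_2\cdots$, the field consists of $n_1, n_2, \ldots$ uncharged particles
with the energies $c\sqrt{m_0^2c^2 + k_1^2\hbar^2}$, $c\sqrt{m_0^2c^2 + k_2^2\hbar^2}$, ... and
momenta $\mathbf{k}_1\hbar$, $\mathbf{k}_2\hbar$, ... respectively.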

53. Quantization of a Vector Field. For the quantization of a 
vector or pseudovector field it is convenient first to eliminate the time 
component $\psi_4$ from the equations, the corresponding momentum
being zero according to (264) anyway. According to (258) and (264), 
we have 


and so the expression for H may be changed to 


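$$H = \kappa c^2\pi_i^*\pi_i + \frac{\kappa}{2}\chi_{ik}^*\chi_{ik} + \frac{c^2}{\kappa}\frac{\partial\pi_i^*}{\partial x_i}\frac{\partial\pi_k}{\partial x_k} + \kappa\psi_i^*\psi_i \tag{285}$$

The summation over $i$ is to be from 1 to 3 only. The components of
the field momentum $g_\alpha$ are determined by the tensor (267) to be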


1 : , 
Qu = = Ty = = (xes*x4s + X4s*Xai) + — (Wa*s + Pa* Pa) 
ic c c 
ok 5 
= « (Xat*m;* + Xaiti) — (2 oe + a ors) (286) 
Ox; Ox; 


On the other hand, the charge density, according to (262), is given by
the formula

$$\rho = ia\,(\psi_i\pi_i - \psi_i^*\pi_i^*) \tag{287}$$

from which we see again that $\rho = 0$ for a real field.
The expressions (285), (286), and (287) contain only those quantities
$\psi_1, \psi_2, \psi_3$ and the corresponding momenta $\pi_1, \pi_2, \pi_3$ which can be
associated with the vector functions $\boldsymbol{\psi}$ and $\boldsymbol{\pi}$. In order to introduce
denumerable coordinates, $\boldsymbol{\psi}$ and $\boldsymbol{\pi}$ are expanded in terms of a complete
orthogonal system which now must consist of vector functions. To
attain this end we multiply by the unit vector $\mathbf{e}_i$ the scalar functions
$(1/\sqrt{l^3})\,e^{i\mathbf{k}_i\cdot\mathbf{r}}$ used in the preceding section. The product then
represents a wave polarized in the direction of $\mathbf{e}_i$. The vector $\mathbf{k}_i$
defines the direction of propagation and has the components $(2\pi/l)n_1$,
$(2\pi/l)n_2$, $(2\pi/l)n_3$, where $n_1, n_2, n_3$ denote whole (positive and negative)
numbers. For a longitudinal wave, $\mathbf{e}_i$ has the same direction as $\mathbf{k}_i$
(thus making $\mathbf{e}_i\cdot\mathbf{k}_i = k_i$) and is designated by $\mathbf{e}_{i1}$, whereas, for a transverse
wave, $\mathbf{e}_i$ is perpendicular to $\mathbf{k}_i$ (meaning that $\mathbf{e}_i\cdot\mathbf{k}_i = 0$). All
directions perpendicular to $\mathbf{k}_i$ can be represented with the help of two
directions $\mathbf{e}_{i2}$ and $\mathbf{e}_{i3}$ which are perpendicular to each other, so that
for a given $i$ there exist three vector functions, which may be designated
by $\mathbf{u}_{ij} = (1/\sqrt{l^3})\,\mathbf{e}_{ij}\,e^{i\mathbf{k}_i\cdot\mathbf{r}}$ $(j = 1, 2, 3)$. The $\boldsymbol{\psi}$ and $\boldsymbol{\pi}$ are represented
by

$$\boldsymbol{\psi} = \frac{1}{\sqrt{l^3}}\sum_{ij}q_{ij}\,\mathbf{e}_{ij}\,e^{i\mathbf{k}_i\cdot\mathbf{r}} \qquad \boldsymbol{\pi} = \frac{1}{\sqrt{l^3}}\sum_{ij}p_{ij}\,\mathbf{e}_{ij}\,e^{-i\mathbf{k}_i\cdot\mathbf{r}} \tag{288}$$


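from which follows, for $\chi_{\alpha\beta}$,

$$\chi_{\alpha\beta} = \frac{i}{\kappa\sqrt{l^3}}\sum_{ij}q_{ij}\,\big(e_{ij}^{\beta}k_i^{\alpha} - e_{ij}^{\alpha}k_i^{\beta}\big)\,e^{i\mathbf{k}_i\cdot\mathbf{r}}$$

if $e_{ij}^{\alpha}$ and $k_i^{\alpha}$ denote the components of the vectors $\mathbf{e}_{ij}$ and $\mathbf{k}_i$ taken
in the direction $\alpha$. The expression in parentheses is a component of
the product $\mathbf{e}_{ij}\times\mathbf{k}_i$ and, therefore, only for a transverse wave is not
equal to zero. Accordingly the integration of $\tfrac{1}{2}\chi_{ik}^*\chi_{ik}$ over the cubic
space gives the value

$$\frac{1}{\kappa^2}\sum_{ij}q_{ij}^*q_{ij}\,k_i^2\,(1 - \delta_{j1})$$

where $\delta_{j1} = 1$ for $j = 1$, and 0 for $j = 2, 3$. On the other hand, we
obtain for $\partial\pi_i/\partial x_i$

$$\frac{\partial\pi_\alpha}{\partial x_\alpha} = -\frac{i}{\sqrt{l^3}}\sum_{ij}p_{ij}\,e_{ij}^{\alpha}k_i^{\alpha}\,e^{-i\mathbf{k}_i\cdot\mathbf{r}} = -\frac{i}{\sqrt{l^3}}\sum_{ij}p_{ij}\,(\mathbf{e}_{ij}\cdot\mathbf{k}_i)\,e^{-i\mathbf{k}_i\cdot\mathbf{r}}$$

so that for the integral of $(\partial\pi_i^*/\partial x_i)(\partial\pi_k/\partial x_k)$ we obtain

$$\sum_{ij}p_{ij}^*p_{ij}\,k_i^2\,\delta_{j1}$$

This gives us for $H$, $\mathbf{G}$, and $\varepsilon$

$$H = \sum_{ij}p_{ij}^*p_{ij}\,\kappa c^2\left(1 + \frac{k_i^2}{\kappa^2}\delta_{j1}\right) + \sum_{ij}q_{ij}^*q_{ij}\left[\kappa + \frac{k_i^2}{\kappa}(1 - \delta_{j1})\right]$$

$$\mathbf{G} = \sum_{ij}i\,\mathbf{k}_i\,(q_{ij}p_{ij} - q_{ij}^*p_{ij}^*) \tag{289}$$

$$\varepsilon = \sum_{ij}ia\,(q_{ij}p_{ij} - q_{ij}^*p_{ij}^*)$$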

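In order to quantize the system we substitute the matrices $Q_{ij}$ and
$P_{ij}$ for $q_{ij}$ and $p_{ij}$, requiring that $Q_{ij}$ and $P_{ij}$ satisfy the relations

$$[Q_{ij}P_{kj'}] = \frac{\hbar}{i}\,\delta_{ik}\,\delta_{jj'}E \qquad [Q_{ij}Q_{kj'}] = 0 \qquad [P_{ij}P_{kj'}] = 0 \tag{290}$$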


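Passing again from $Q_{ij}$ and $P_{ij}$ to the new matrices $A_{ij}$ and $B_{ij}$, which
now are defined by the relations

$$Q_{ij} = -i\sqrt{\frac{c\hbar}{2}}\;\sqrt[4]{\frac{1 + (k_i^2/\kappa^2)\,\delta_{j1}}{1 + (k_i^2/\kappa^2)(1 - \delta_{j1})}}\;(A_{ij} - B_{ij}^\dagger)$$

$$P_{ij} = \sqrt{\frac{\hbar}{2c}}\;\sqrt[4]{\frac{1 + (k_i^2/\kappa^2)(1 - \delta_{j1})}{1 + (k_i^2/\kappa^2)\,\delta_{j1}}}\;(A_{ij}^\dagger + B_{ij}) \tag{291}$$

we have

$$[Q_{ij}P_{kj'}] = \frac{\hbar}{2i}\Big([A_{ij}A_{kj'}^\dagger] - [B_{ij}^\dagger B_{kj'}] + [A_{ij}B_{kj'}] - [B_{ij}^\dagger A_{kj'}^\dagger]\Big)$$

Thus the conditions of (290) are satisfied if

$$[A_{ij}A_{kj'}^\dagger] = \delta_{ik}\,\delta_{jj'}E \qquad [B_{ij}B_{kj'}^\dagger] = \delta_{ik}\,\delta_{jj'}E \tag{292}$$

with all other bracketed terms being equal to zero. The expressions of
(289) now can be written

$$H = \sum_{ij}\frac{\kappa\hbar c}{2}\sqrt{\left(1 + \frac{k_i^2}{\kappa^2}\delta_{j1}\right)\left(1 + \frac{k_i^2}{\kappa^2}(1 - \delta_{j1})\right)}\;\Big[(A_{ij} + B_{ij}^\dagger)(A_{ij}^\dagger + B_{ij}) + (A_{ij}^\dagger - B_{ij})(A_{ij} - B_{ij}^\dagger)\Big]$$

$$= \sum_{ij}\hbar c\sqrt{\kappa^2 + k_i^2}\;\big(A_{ij}^\dagger A_{ij} + B_{ij}^\dagger B_{ij} + E\big)$$

$$\mathbf{G} = \sum_{ij}\mathbf{k}_i\hbar\,\big(A_{ij}^\dagger A_{ij} - B_{ij}^\dagger B_{ij}\big) \tag{293}$$

$$\varepsilon = a\hbar\sum_{ij}\big(A_{ij}^\dagger A_{ij} - B_{ij}^\dagger B_{ij}\big)$$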
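As we did in Section 52, we refer to that coordinate system $K$ of the
Hilbert space in which all the products $A_{ij}^\dagger A_{ij}$ and $B_{ij}^\dagger B_{ij}$ are diagonal.
Any axis of $K$ can then be marked by two sequences of numbers,
$n_{ij}^+$ and $n_{ij}^-$, and the elements of the products $A_{ij}^\dagger A_{ij}$ and $B_{ij}^\dagger B_{ij}$ are
given by

$$(A_{ij}^\dagger A_{ij})_{nn} = n_{ij}^+ \qquad (B_{ij}^\dagger B_{ij})_{nn} = n_{ij}^-$$

with all the non-diagonal elements being equal to zero. In a state
represented by a vector having the direction of an axis $n_{ij}^+n_{ij}^-$, $H$, $\mathbf{G}$,
and $\varepsilon$ have the values

$$H = \sum_{ij}c\sqrt{m_0^2c^2 + k_i^2\hbar^2}\;\big(n_{ij}^+ + n_{ij}^- + 1\big)$$

$$\mathbf{G} = \sum_{ij}\mathbf{k}_i\hbar\,\big(n_{ij}^+ - n_{ij}^-\big)$$

$$\varepsilon = a\hbar\sum_{ij}\big(n_{ij}^+ - n_{ij}^-\big) = e\sum_{ij}\big(n_{ij}^+ - n_{ij}^-\big) \quad\text{for } a\hbar = e$$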


This means that, if the vector representing the state of the field has
the direction of an axis of $K$, the field behaves like a system of $\sum_{ij}n_{ij}^+$
particles with a charge $+e$ and $\sum_{ij}n_{ij}^-$ particles with a charge $-e$.
All the $n_{ij}^+$ and $n_{ij}^-$ particles which belong to the same axis $i$ have the
same energy $c\sqrt{m_0^2c^2 + k_i^2\hbar^2}$, and the momenta are $+\mathbf{k}_i\hbar$ and
$-\mathbf{k}_i\hbar$ respectively. Because of the index $j$, two kinds of particles
occur which may be transverse or longitudinal.

If $\psi$ is real, we must again, as in Section 52, take two matrices,
$A_{ij}$ and $B_{ij}$, for which

$$A_{-i,j} = B_{ij} \qquad B_{-i,j} = A_{ij}$$

Then the charge is zero, and the terms with $B_{ij}$ in $H$ and $\mathbf{G}$ are to be
cancelled.
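
The three polarization directions used in this section (one longitudinal direction along $\mathbf{k}_i$ and two mutually perpendicular transverse directions) can be constructed explicitly. The sketch below is illustrative only: the function name `polarization_triad` and the particular choice of the two transverse directions are assumptions, since any rotation of the transverse pair about $\mathbf{k}_i$ serves equally well.

```python
import math


def _cross(a, b):
    """Vector product of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _normalize(v):
    """Return v scaled to unit length."""
    norm = math.sqrt(sum(c * c for c in v))
    if norm == 0.0:
        raise ValueError("cannot normalize the zero vector")
    return tuple(c / norm for c in v)


def polarization_triad(k):
    """Return (e1, e2, e3): e1 along k (longitudinal), e2 and e3 transverse."""
    e1 = _normalize(k)
    # Use the coordinate axis least aligned with k as a helper direction.
    axis = min(range(3), key=lambda a: abs(e1[a]))
    helper = tuple(1.0 if a == axis else 0.0 for a in range(3))
    e2 = _normalize(_cross(e1, helper))
    e3 = _cross(e1, e2)
    return e1, e2, e3
```

With such a triad, $\mathbf{e}_{i1}\cdot\mathbf{k}_i = k_i$ and $\mathbf{e}_{i2}\cdot\mathbf{k}_i = \mathbf{e}_{i3}\cdot\mathbf{k}_i = 0$, which are the properties the expansion (288) relies on.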

As a final precaution, let us emphasize again that the relativistic 
wave equations must be understood in a sense entirely different from 
that of non-relativistic theory. The non-relativistic $\psi$ has only the
significance of a probability function which is represented by the
picture of a wave motion. In contradistinction to this, the relativistic
$\psi$ refers to a real and not a symbolic wave field that possesses energy,
momentum, and charge. This interpretation implies that the rela- 
tivistic field, as long as it is not quantized, is not associated with the 
idea of particles, an idea which in non-relativistic quantum mechanics 
constitutes an essential supposition. The non-quantized relativistic 
theory is competent only for those experiments in which particles act 
like waves. Only in the quantized theory do particles appear because 
the quantization creates properties of the field which can be expressed 
only in terms of whole numbers. In this way the theory succeeds in 
explaining that peculiar ambiguity which is to be observed in matter 
and light. The consistent interpretation of this ambiguity is based 
on the recognition that quantities that are measured relative to the 
wave and the particle picture respectively are not commutative, the 
consequence being that, when a quantity referring to the wave picture 
is measured, all quantities associated with the particle picture become 
unobservable and vice versa. 


8 
QUANTUM ELECTRODYNAMICS 


54. Classical Theory. The Field as a Superposition of Plane 
Waves. Now we shall apply the formalism of relativistic quantum 
mechanics to the process of light. In Maxwell’s theory this process 
is described by a set of wave equations which are adequate for a com- 
plete understanding of all those experiments in which light displays 
the nature of a wave motion. But in other experiments light displays 
the properties of a corpuscular radiation, for in these any apparatus 
capable of measuring the energy and momentum of a radiation always 
registers integer multiples of certain fundamental units. This
ambiguous nature of light cannot be understood on the basis of a pure 
wave theory. However, a consistent explanation can be given if, on 
interpreting the field quantities as matrices, we quantize the wave 
equations. The field retains its wave character but simultaneously 
takes on such qualities that in its interaction with matter it behaves 
like a corpuscular radiation. Thus it is to be hoped that by this 
quantization we may arrive at a theory that is consistent with experi- 
mental facts. 

As a beginning we shall develop the classical theory of light. Ac- 
cording to Maxwell and Lorentz, the field corresponding to a given 
charge and current distribution is described by the equations 


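$$\operatorname{curl}\mathbf{E} = -\frac{1}{c}\frac{\partial\mathbf{H}}{\partial t} \qquad \operatorname{div}\mathbf{E} = 4\pi\rho$$

$$\operatorname{curl}\mathbf{H} = \frac{1}{c}\frac{\partial\mathbf{E}}{\partial t} + \frac{4\pi}{c}\rho\mathbf{v} \qquad \operatorname{div}\mathbf{H} = 0 \tag{294}$$

If the last equation above is solved by setting

$$\mathbf{H} = \operatorname{curl}\mathbf{A} \qquad (\mathbf{A} = \text{vector potential}) \tag{295}$$

the first becomes

$$\operatorname{curl}\left(\mathbf{E} + \frac{1}{c}\frac{\partial\mathbf{A}}{\partial t}\right) = 0 \quad\text{or}\quad \mathbf{E} = -\frac{1}{c}\frac{\partial\mathbf{A}}{\partial t} - \operatorname{grad}\phi \tag{296}$$

where $\phi$ is the scalar potential. From the other equations we then
have

$$\frac{1}{c}\operatorname{div}\frac{\partial\mathbf{A}}{\partial t} + \nabla^2\phi + 4\pi\rho = 0 \tag{297}$$

$$\frac{1}{c^2}\frac{\partial^2\mathbf{A}}{\partial t^2} - \nabla^2\mathbf{A} + \operatorname{grad}\left(\operatorname{div}\mathbf{A} + \frac{1}{c}\frac{\partial\phi}{\partial t}\right) = \frac{4\pi}{c}\rho\mathbf{v} \tag{298}$$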


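The potentials $\mathbf{A}$ and $\phi$, associated with a given field $\mathbf{E}$, $\mathbf{H}$, are not
uniquely determined by (295) and (296); for, if $\mu(xyzt)$ is an arbitrary
function of coordinates and time, $\mathbf{E}$ and $\mathbf{H}$ are not changed when $\mathbf{A}$
and $\phi$ are replaced by $\mathbf{A}' = \mathbf{A} + \operatorname{grad}\mu$ and $\phi' = \phi - (1/c)(\partial\mu/\partial t)$ (gauge
invariance). Thus by choosing $\mu$ in a suitable way, we may cause the
scalar potential to vanish. Then $\mathbf{E}$ becomes

$$\mathbf{E} = -\frac{1}{c}\frac{\partial\mathbf{A}}{\partial t}$$

whereas (297) and (298) become

$$\frac{1}{c}\operatorname{div}\frac{\partial\mathbf{A}}{\partial t} + 4\pi\rho = 0 \tag{297'}$$

$$\frac{1}{c^2}\frac{\partial^2\mathbf{A}}{\partial t^2} - \nabla^2\mathbf{A} + \operatorname{grad}\operatorname{div}\mathbf{A} = \frac{4\pi}{c}\rho\mathbf{v} \tag{298'}$$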


Now for the continuous field quantity $\mathbf{A}(xyzt)$ we substitute a denumerable
set of coordinates $q(t)$ by expanding the potential $\mathbf{A}$ of the radiation in
terms of the functions $\mathbf{e}_{ij}\,e^{i\mathbf{k}_i\cdot\mathbf{r}}$. (The radiation is imagined to be in a
cubic cavity of extension $l$.) In this manner the field is resolved into
plane waves. Any of the vectors $\mathbf{k}_i$ is specified by three whole numbers,
$n_1, n_2, n_3$, which may be positive or negative and which define the
components of $\mathbf{k}_i$ by $k_i^{\alpha} = (2\pi/l)n_\alpha$. If we write the product
$\mathbf{k}_i\cdot\mathbf{r} = (2\pi/l)(n_1x + n_2y + n_3z)$ in the form $(2\pi/\lambda_i)(\alpha x + \beta y + \gamma z)$,
where $\lambda_i$ is the wavelength and $\alpha, \beta, \gamma$ the direction cosines of $\mathbf{k}_i$, we
obtain

$$\frac{1}{\lambda_i^2} = \frac{n_1^2 + n_2^2 + n_3^2}{l^2} \tag{299}$$

Letting $\nu_i$ be the frequency associated with $\lambda_i$, and defined by $2\pi/\tau$,
$\tau$ being the period, then $c/\nu_i = \lambda_i/2\pi$, and thus

$$\nu_i = \frac{2\pi c}{l}\sqrt{n_1^2 + n_2^2 + n_3^2} \tag{300}$$
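
Relations (299) and (300) fix the wavelength and the (angular) frequency of a cavity mode once the three whole numbers are given. A minimal sketch, assuming Gaussian units and hypothetical helper names `mode_wavelength` and `mode_angular_frequency`:

```python
import math


def mode_wavelength(n_vec, box_length):
    """Wavelength from (299): 1/lambda^2 = (n1^2 + n2^2 + n3^2) / l^2."""
    n_norm = math.sqrt(sum(n * n for n in n_vec))
    if n_norm == 0.0:
        raise ValueError("the mode (0, 0, 0) has no finite wavelength")
    return box_length / n_norm


def mode_angular_frequency(n_vec, box_length, c=2.99792458e10):
    """Frequency from (300): nu = (2*pi*c/l) * sqrt(n1^2 + n2^2 + n3^2).

    nu is the angular frequency 2*pi/tau; c defaults to cm/s.
    """
    n_norm = math.sqrt(sum(n * n for n in n_vec))
    return 2.0 * math.pi * c / box_length * n_norm
```

For every mode the two results satisfy $c/\nu_i = \lambda_i/2\pi$, which is the relation used to pass from (299) to (300).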


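If we normalize the functions $\mathbf{e}_{ij}\,e^{i\mathbf{k}_i\cdot\mathbf{r}}$ by means of the factor $\sqrt{4\pi c^2/l^3}$,
the functions

$$\mathbf{u}_{ij} = \sqrt{\frac{4\pi c^2}{l^3}}\;\mathbf{e}_{ij}\,e^{i\mathbf{k}_i\cdot\mathbf{r}}$$

satisfy the condition

$$\int\mathbf{u}_{ij}^*\cdot\mathbf{u}_{kj'}\,dv = 4\pi c^2\,\delta_{ik}\,\delta_{jj'} \tag{301}$$

and we obtain for $\mathbf{A}$ the expansion

$$\mathbf{A} = \sum_{ij}q_{ij}\,\mathbf{u}_{ij} \tag{302}$$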


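The index $i$ refers to the direction of propagation of the plane wave
$\mathbf{u}_{ij}$, two opposite directions being denoted by $i$ and $-i$. In (302) all
positive and negative integer numbers must be considered. On the
other hand, the index $j$ determines the direction of polarization,
$j = 1$ signifying a longitudinal wave and $j = 2, 3$ a transverse wave.
For $j = 1$, $\mathbf{e}_{ij}$ has the direction of $\mathbf{k}_i$, and therefore $\mathbf{e}_{ij}\times\mathbf{k}_i = 0$; for
$j = 2, 3$, $\mathbf{e}_{ij}\cdot\mathbf{k}_i = 0$. Consequently, in the case of a function $\mathbf{u}_{ij}$,
when $j = 1$, $\operatorname{curl}\mathbf{u}_{ij} = 0$; and when $j = 2, 3$, $\operatorname{div}\mathbf{u}_{ij} = 0$. In what
follows we shall designate the longitudinal wave $\mathbf{u}_{i1}$ by $\mathbf{u}_l$ and the
transverse wave $\mathbf{u}_{i2}$ or $\mathbf{u}_{i3}$ by $\mathbf{u}_t$. Thus, for all waves $\mathbf{u}_l$, $\operatorname{curl}\mathbf{u}_l = 0$;
for all waves $\mathbf{u}_t$, $\operatorname{div}\mathbf{u}_t = 0$. Hence (302) may be written

$$\mathbf{A} = \mathbf{A}_t + \mathbf{A}_l = \sum_t q_t\mathbf{u}_t + \sum_l q_l\mathbf{u}_l$$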
i f 


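Substituting this value of $\mathbf{A}$ in (298'), keeping in mind that because of
(300) we have for any value of $\mathbf{u}$

$$\nabla^2\mathbf{u} = -\frac{\nu^2}{c^2}\,\mathbf{u}$$

and also that, since $\operatorname{curl}\operatorname{curl} = -\nabla^2 + \operatorname{grad}\operatorname{div}$, for any $\mathbf{u}_l$

$$\operatorname{grad}\operatorname{div}\mathbf{u}_l = \nabla^2\mathbf{u}_l$$

we obtain

$$\frac{1}{c^2}\sum_t\left(\frac{d^2q_t}{dt^2} + \nu_t^2q_t\right)\mathbf{u}_t + \frac{1}{c^2}\sum_l\frac{d^2q_l}{dt^2}\,\mathbf{u}_l = \frac{4\pi}{c}\rho\mathbf{v}$$

On multiplying this by one of the $\mathbf{u}_{ij}^*$ and integrating over the space
of the enclosure, we obtain, because of (301),

$$\frac{d^2q_t}{dt^2} + \nu_t^2q_t = \frac{1}{c}\int\mathbf{u}_t^*\cdot\rho\mathbf{v}\,dv \tag{303}$$

$$\frac{d^2q_l}{dt^2} = \frac{1}{c}\int\mathbf{u}_l^*\cdot\rho\mathbf{v}\,dv \tag{304}$$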




It turns out then that the transverse and longitudinal $q_{ij}$ terms satisfy
two different differential equations. Whereas (303) agrees in form
with the equation for the forced vibrations of an oscillator, equation
(304) corresponds to the motion of a free point mass on which a force
is acting. Because of this difference the transverse part of the field can
be quantized; this is impossible for the longitudinal part of the field.
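
The difference between the two types of equation can be made concrete with a simple special case that is not in the original text: a constant driving term $f$ on the right-hand side and a coordinate starting from rest at the origin. The function names below are hypothetical. The oscillator-type equation (303) then has the bounded solution $(f/\nu^2)(1 - \cos\nu t)$, whereas the free-mass equation (304) gives the unbounded solution $\tfrac{1}{2}ft^2$.

```python
import math


def transverse_response(t, nu, force):
    """Solution of q'' + nu^2 q = force with q(0) = q'(0) = 0 (form of (303))."""
    return force / nu ** 2 * (1.0 - math.cos(nu * t))


def longitudinal_response(t, force):
    """Solution of q'' = force with q(0) = q'(0) = 0 (form of (304))."""
    return 0.5 * force * t ** 2


if __name__ == "__main__":
    for t in (0.0, 1.0, 10.0, 100.0):
        print(t, transverse_response(t, 3.0, 2.0), longitudinal_response(t, 2.0))
```

The transverse coordinate oscillates between 0 and $2f/\nu^2$, while the longitudinal coordinate grows without limit.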

To evaluate the right-hand sides of (303) and (304) we assume that
the matter consists of particles with charge $e$. Then we have for the
point particles

$$\int\mathbf{u}^*\cdot\rho\mathbf{v}\,dv = \sum_k e\,\mathbf{v}_k\cdot\mathbf{u}^*(P_k)$$

where $\mathbf{v}_k$ denotes the velocity of the $k$th particle and $\mathbf{u}(P_k)$ is the value
of $\mathbf{u}$ at the point $P_k$ occupied by that same particle.

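The next step is to derive (303) and (304) as canonical equations
from a Hamiltonian $H$. We maintain that $H$ is given by the expression

$$H = \sum_{t>0}\big(p_t^*p_t + \nu_t^2q_t^*q_t\big) + \sum_{l>0}p_l^*p_l + \sum_k c\sqrt{m^2c^2 + \Big(\mathbf{p}_k - \frac{e}{c}\mathbf{A}(P_k)\Big)^2} \tag{305}$$

the summation to be extended over the positive values of $t$ and $l$
only, that is, only over the directions of the hemisphere. The same
holds for $\mathbf{A}$, which, we imagine, is expanded in the form

$$\mathbf{A} = \sum_{t>0}\big(q_t\mathbf{u}_t + q_t^*\mathbf{u}_t^*\big) + \sum_{l>0}\big(q_l\mathbf{u}_l + q_l^*\mathbf{u}_l^*\big)$$

such an expansion being possible because $q_{-i} = q_i^*$ when $\mathbf{A}$ is real.
The third term in the expression for $H$ represents the energy of the
particles, the quantity $\mathbf{A}(P_k)$ in this term signifying the value of $\mathbf{A}$
at the $P_k$ position of the $k$th particle. Equation (305) is a function of
$q\,p\,q^*p^*$, and also it depends on the coordinates $x_ky_kz_k$ and the momenta
$\mathbf{p}_k$ of the particles. In order to prove that $H$ represents the Hamiltonian
of the system, we consider the canonical equations which result
from (305):

$$\frac{dq_t}{dt} = \frac{\partial H}{\partial p_t} \qquad \frac{dp_t}{dt} = -\frac{\partial H}{\partial q_t} \tag{a}$$

$$\frac{dq_l}{dt} = \frac{\partial H}{\partial p_l} \qquad \frac{dp_l}{dt} = -\frac{\partial H}{\partial q_l} \tag{b}$$

$$\frac{dx_k}{dt} = \frac{\partial H}{\partial p_k^x} \qquad \frac{dp_k^x}{dt} = -\frac{\partial H}{\partial x_k} \tag{c}$$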


Taking into account (243), (a) and (b) above give

$$\frac{dq_t}{dt} = p_t^* \qquad \frac{dp_t}{dt} = -\nu_t^2q_t^* + \frac{1}{c}\sum_k e\,\mathbf{v}_k\cdot\mathbf{u}_t(P_k)$$

$$\frac{dq_l}{dt} = p_l^* \qquad \frac{dp_l}{dt} = \frac{1}{c}\sum_k e\,\mathbf{v}_k\cdot\mathbf{u}_l(P_k) \tag{306}$$

From these equations we get the same values for $d^2q_t/dt^2$ and $d^2q_l/dt^2$ as from (303)
and (304). The (c) equations determine the motion of the particles,
as demonstrated in Section 47.

Instead of equation (305), the energy of the system can be represented
also by

$$H = \frac{1}{2}\sum_t\big(p_t^*p_t + \nu_t^2q_t^*q_t\big) + \frac{1}{2}\sum_l p_l^*p_l + \sum_k c\sqrt{m^2c^2 + \Big(\mathbf{p}_k - \frac{e}{c}\mathbf{A}\Big)^2} \tag{305'}$$

in which the summation is taken over both the positive and negative
values of $t$ and $l$.
55. Transformation of the Hamiltonian. The first two terms
of (305') represent the field energy, $U = (1/8\pi)\int dv\,(E^2 + H^2)$,
for we have

$$U = \frac{1}{8\pi}\int\left[\frac{1}{c^2}\left(\frac{\partial\mathbf{A}}{\partial t}\right)^2 + (\operatorname{curl}\mathbf{A})^2\right]dv$$

Because of (301) and (306) the first term on the right gives

$$\frac{1}{2}\sum_i\frac{dq_i^*}{dt}\frac{dq_i}{dt} = \frac{1}{2}\sum_t p_t^*p_t + \frac{1}{2}\sum_l p_l^*p_l$$

On the other hand, the second term can be transformed into

$$\frac{1}{2}\sum_t\nu_t^2q_t^*q_t$$


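We can effect this transformation with the help of the relations

$$\int\operatorname{curl}\mathbf{u}_t^*\cdot\operatorname{curl}\mathbf{u}_t\,dv = \oint\big[\mathbf{u}_t^*\times\operatorname{curl}\mathbf{u}_t\big]_n\,df + \int\big(\mathbf{u}_t^*\cdot\operatorname{curl}\operatorname{curl}\mathbf{u}_t\big)\,dv$$

and

$$\operatorname{curl}\operatorname{curl}\mathbf{u}_t = -\nabla^2\mathbf{u}_t = \frac{\nu_t^2}{c^2}\,\mathbf{u}_t$$

keeping in mind the fact that the surface integral vanishes because of
the periodicity of $\mathbf{u}$ and the derivatives of $\mathbf{u}$.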
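It is important that the longitudinal part of the field energy
$\tfrac{1}{2}\sum_l p_l^*p_l$ turns out to be identical with the energy corresponding to
the Coulomb interaction. This can be proved with the help of (297'),
according to which $(1/c)\operatorname{div}\,\partial\mathbf{A}/\partial t = -4\pi\rho$. Owing to the fact
that $\operatorname{div}\mathbf{u}_t = 0$, we have

$$\operatorname{div}\frac{\partial\mathbf{A}}{\partial t} = \sum_l\frac{dq_l}{dt}\operatorname{div}\mathbf{u}_l = \sum_l p_l^*\operatorname{div}\mathbf{u}_l$$

On the other hand, because $\mathbf{e}_l\cdot\mathbf{k}_l = k_l = 2\pi/\lambda_l$, we have for $\operatorname{div}\mathbf{u}_l$

$$\operatorname{div}\mathbf{u}_l = i\,\frac{2\pi}{\lambda_l}\sqrt{\frac{4\pi c^2}{l^3}}\;e^{i\mathbf{k}_l\cdot\mathbf{r}} = i\,\frac{\nu_l}{c}\sqrt{\frac{4\pi c^2}{l^3}}\;e^{i\mathbf{k}_l\cdot\mathbf{r}}$$

According to Fourier's theorem, $f_l = \sqrt{4\pi c^2/l^3}\;e^{i\mathbf{k}_l\cdot\mathbf{r}}$ forms a complete
orthogonal system since any scalar periodical function can be represented
in terms of it. (For vector functions the orthogonal system is
given by the $\mathbf{u}_{ij}$ terms.) The $f_l$ terms satisfy the normalization
condition

$$\int f_l^*f_{l'}\,dv = \delta_{ll'}\,4\pi c^2 \tag{307}$$

Equation (297') now becomes

$$\operatorname{div}\frac{\partial\mathbf{A}}{\partial t} = \frac{i}{c}\sum_l\nu_lp_l^*f_l = -4\pi\rho c$$

On multiplying this equation by $f_l^*$ and integrating, we get

$$i\nu_lp_l^* = -\int\rho f_l^*\,dv = -\sum_k e\,f_l^*(P_k)$$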
E 


where $f_l^*(P_k)$ signifies the value of $f_l^*$ associated with the position $P_k$
of the $k$th particle. Accordingly we obtain

$$\frac{1}{2}\sum_l p_l^*p_l = \frac{e^2}{2}\sum_l\sum_{ik}\frac{f_l^*(P_k)\,f_l(P_i)}{\nu_l^2}$$

In this expression the summation is over all $i$ and $k$ values, and thus
any $ik$ combination occurs twice ($i \neq k$). Hence two particles, $i$
and $k$, contribute

$$H_{ik} = e^2\sum_l\frac{f_l^*(P_k)\,f_l(P_i)}{\nu_l^2}$$

which depends on the coordinates of the points $P_i$ and $P_k$. If we consider
$H_{ik}$ a function of $P_i$, the point $P_k$ being considered fixed, we obtain
for $\nabla^2H_{ik}$, since $\nabla^2f = -(\nu^2/c^2)f$,

$$\nabla^2H_{ik} = -\frac{e^2}{c^2}\sum_l f_l^*(P_k)\,f_l(P_i) = -4\pi e^2\,\delta(P_i - P_k) \tag{308}$$
l 


The function $\delta(P_i - P_k)$ vanishes for $P_i \neq P_k$, but for $P_i = P_k$ it
becomes infinite in such a way that the integral of $\delta(P_i - P_k)$, taken
over an arbitrary small domain surrounding $P_k$, has the value unity.

From (308) it follows that $H_{ik} = e^2/r$, where $r$ is the distance of
the points $P_i$ and $P_k$, for $e^2/r$ considered as a function of $P_i$ agrees
with $H_{ik}$ in satisfying the equation $\nabla^2(e^2/r) = 0$ for the case where
$P_i$ is not equal to $P_k$, but, when $P_i$ is equal to $P_k$,

$$\int\nabla^2\frac{e^2}{r}\,dv = \oint\operatorname{grad}_n\frac{e^2}{r}\,df = -\frac{e^2}{r^2}\,4\pi r^2 = -4\pi e^2$$

Thus we get

$$\frac{1}{2}\sum_l p_l^*p_l = \sum_{i>k}\frac{e^2}{r_{ik}} \tag{309}$$

That is, $\tfrac{1}{2}\sum_l p_l^*p_l$ is identical with the Coulomb energy of the point
charges contained in the field. The term with $i = k$ represents the
longitudinal, or electrostatic, self-energy of the particles. For point
particles this self-energy becomes infinite.
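
The pair sum on the right-hand side of (309) is straightforward to evaluate for a given configuration. The sketch below assumes Gaussian units, equal charges, and a hypothetical function name `coulomb_energy`; the self-energy terms with $i = k$ are left out because they diverge for point particles.

```python
import math
from itertools import combinations


def coulomb_energy(positions, charge):
    """Sum of charge^2 / r_ik over all distinct pairs of point charges."""
    total = 0.0
    for p_i, p_k in combinations(positions, 2):
        r_ik = math.dist(p_i, p_k)
        if r_ik == 0.0:
            raise ValueError("coincident point charges give an infinite energy")
        total += charge ** 2 / r_ik
    return total


if __name__ == "__main__":
    corners = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
    print(coulomb_energy(corners, 2.0))
```

Each pair is counted once, which corresponds to the restriction $i > k$ in the sum.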


If we substitute $\sum_{i>k}e^2/r_{ik}$ for the second term in equation (305) of
the Hamiltonian, the canonical equations (306) lose their validity, for
in the equation for $dp_k^x/dt$, for example, a new term arising from (309)
must be taken into account. It can be shown, however, that the
equations of (306) remain valid (except for the second one, which
becomes meaningless) when, after $\tfrac{1}{2}\sum_l p_l^*p_l$ has been replaced by
$\sum_{i>k}e^2/r_{ik}$ in (305), in the third term only the transverse part of $\mathbf{A}$
($\mathbf{A}_t = \sum_t q_t\mathbf{u}_t$) is taken so that the $q_l$ coordinates as well as the $p_l$
are eliminated from $H$. We then obtain for $H$

$$H = \frac{1}{2}\sum_t\big(p_t^*p_t + \nu_t^2q_t^*q_t\big) + \sum_{i>k}\frac{e^2}{r_{ik}} + \sum_k c\sqrt{m^2c^2 + \Big(\mathbf{p}_k - \frac{e}{c}\mathbf{A}_t\Big)^2} \tag{310}$$

The equations for $dq_t/dt$ and $dp_t/dt$ in (306) then remain unchanged,
whereas those for $dq_l/dt$ and $dp_l/dt$ are eliminated. On the other
hand, we obtain for $dx_k/dt$ and $dp_k^x/dt$


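$$\frac{dx_k}{dt} = \frac{\partial H}{\partial p_k^x} = \frac{c\,\big(p_k^x - \frac{e}{c}A_t^x\big)}{\sqrt{m^2c^2 + \big(\mathbf{p}_k - \frac{e}{c}\mathbf{A}_t\big)^2}} \tag{311}$$

$$\frac{dp_k^x}{dt} = -\frac{\partial H}{\partial x_k} = -\frac{\partial}{\partial x_k}\sum_i\frac{e^2}{r_{ik}} + \frac{e\,\big(\mathbf{p}_k - \frac{e}{c}\mathbf{A}_t\big)\cdot\frac{\partial\mathbf{A}_t}{\partial x_k}}{\sqrt{m^2c^2 + \big(\mathbf{p}_k - \frac{e}{c}\mathbf{A}_t\big)^2}} = -\frac{\partial}{\partial x_k}\sum_i\frac{e^2}{r_{ik}} + \frac{e}{c}\left(\mathbf{v}_k\cdot\frac{\partial\mathbf{A}_t}{\partial x_k}\right) \tag{312}$$

If we subtract from (312) the equation

$$\frac{e}{c}\frac{dA_t^x}{dt} = \frac{e}{c}\frac{\partial A_t^x}{\partial t} + \frac{e}{c}\left(v_k^x\frac{\partial A_t^x}{\partial x_k} + v_k^y\frac{\partial A_t^x}{\partial y_k} + v_k^z\frac{\partial A_t^x}{\partial z_k}\right)$$

then, because $\mathbf{H} = \operatorname{curl}\mathbf{A}$, we obtain

$$\frac{d}{dt}\left(p_k^x - \frac{e}{c}A_t^x\right) = -\frac{\partial}{\partial x_k}\sum_i\frac{e^2}{r_{ik}} - \frac{e}{c}\frac{\partial A_t^x}{\partial t} + \frac{e}{c}\,[\mathbf{v}_k\times\mathbf{H}]_x$$

Because of (311), the left-hand side is

$$\frac{d}{dt}\,\frac{m\,(dx_k/dt)}{\sqrt{1 - v_k^2/c^2}}$$

On the right-hand side, according to (296), $-(1/c)(\partial\mathbf{A}_t/\partial t)$ means
the field strength $\mathbf{E}$ due to the transverse field and the first term
determines that part of $e\mathbf{E}$ which is caused by the Coulomb forces,
that is, by the longitudinal field. Again we obtain the equation of
motion

$$\frac{d}{dt}\,\frac{m\mathbf{v}}{\sqrt{1 - v^2/c^2}} = e\left(\mathbf{E} + \frac{1}{c}\,[\mathbf{v}\times\mathbf{H}]\right)$$

by which the correctness of (310) is proved.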

The significance of the three terms which, according to (310), com- 
pose the energy H, is immediately clear. The first term gives the 
energy of the transverse light waves, which can be treated like a 
system of oscillators. The second term refers to the Coulomb inter- 
action between the particles, and the third part represents the kinetic 
energy of the particles together with the energy that is due to the 
interaction of matter with the light waves. 

56. Quantization of the Field. The classical theory we have 
developed above can account only for the wave phenomena of light. 
In order to describe also those processes in which light displays the 
nature of corpuscles, the field must be quantized. For this purpose 
we consider first the field by itself. The Hamiltonian of the field is

$$H = \frac{1}{2}\sum_t\big(p_t^*p_t + \nu_t^2q_t^*q_t\big) + \sum_{i>k}\frac{e^2}{r_{ik}} \tag{313}$$

In order to quantize the field we must transcribe the $q$ and $p$ factors
into matrices $Q$ and $P$. Now, in (313), only the transverse factors
$q_t$ and $p_t$ occur, the corresponding longitudinal factors being completely
eliminated since it has turned out that their energy contribution can
be described by the Coulomb interaction of the particles. Thus
only the transverse and not the longitudinal part of the field can be quantized.
This corresponds to the evidence that only transverse waves
display corpuscular properties. Therefore we only have to deal with
the first term of (313), which upon quantization becomes

$$H = \frac{1}{2}\sum_t\big(P_t^\dagger P_t + \nu_t^2Q_t^\dagger Q_t\big) \tag{314}$$


The summation is to be extended over both positive and negative $t$.
According to (276), $Q_t$ and $P_t$ must satisfy the relations

$$[Q_tP_{t'}] = \frac{\hbar}{i}\,\delta_{tt'}E \qquad [Q_tQ_{t'}] = 0 \qquad [P_tP_{t'}] = 0 \tag{315}$$

Following the method of Sections 52 and 53, we solve the problem by
setting

$$Q_t = i\sqrt{\frac{\hbar}{2\nu_t}}\,(A_t^\dagger - B_t) \qquad P_t = \sqrt{\frac{\hbar\nu_t}{2}}\,(A_t + B_t^\dagger) \tag{316}$$

As the field $\mathbf{A}$ is real, we have $Q_{-t} = Q_t^\dagger$, $P_{-t} = P_t^\dagger$ and, therefore,
according to (284)

$$A_{-t} = B_t \qquad B_{-t} = A_t \tag{317}$$

To satisfy the conditions of (315), it is necessary that

$$[A_tA_{t'}^\dagger] = \delta_{tt'}E \qquad [B_tB_{t'}^\dagger] = \delta_{tt'}E$$

with all other brackets being equal to zero. By substituting (316) in
(314), we get

$$H = \frac{1}{4}\sum_t\hbar\nu_t\Big[(A_t^\dagger + B_t)(A_t + B_t^\dagger) + (A_t - B_t^\dagger)(A_t^\dagger - B_t)\Big]$$

$$= \frac{1}{2}\sum_t\hbar\nu_t\big(A_t^\dagger A_t + B_t^\dagger B_t + E\big) = \sum_t\hbar\nu_t\left(A_t^\dagger A_t + \frac{E}{2}\right) \tag{318}$$

because, since $B_{-t} = A_t$, $B_{-t}^\dagger = A_t^\dagger$, and $\nu_{-t} = \nu_t$, the following transformation
holds:

$$\sum_t\nu_tB_t^\dagger B_t = \sum_t\nu_{-t}A_{-t}^\dagger A_{-t} = \sum_t\nu_tA_t^\dagger A_t$$

(In the last two sums the difference is in the arrangement of terms
only.) In this manner we have arrived at a formalism that corresponds
exactly to that of Section 43. If we refer the $A_t$ terms (in
Section 43 $X_i$ stands for $A_t$) to that coordinate system $K$ of the Hilbert
space in which the products $A_t^\dagger A_t$ are diagonal, the energy $H$ is
also diagonal in $K$ and the eigenvalues of $H$ are given by

$$\sum_t\hbar\nu_t\left(n_t + \tfrac{1}{2}\right)$$


The interpretation is this: Whenever we carry out a measurement of
the field energy, we always find an integer multiple of $\hbar\nu_t$ for any
plane wave; that is, when its energy is measured, light behaves like a
system of corpuscles of energy $\hbar\nu_t$. In addition, the theory provides
an infinite "zero point energy" $\sum_t\hbar\nu_t/2$, which should give any
cavity radiation an infinite mass and must be taken as an indication of
certain limits which restrict the applicability of the laws of quantum
mechanics (cf. Chapter 10).
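
The eigenvalue formula and the growth of the zero-point sum can be illustrated numerically. The sketch below is an illustration under stated assumptions, not the author's procedure: Gaussian units are used, the names `angular_frequency`, `field_energy`, and `zero_point_energy` are hypothetical, each wave vector is given two transverse polarizations ($j = 2, 3$), and the mode sum is cut off at $|n_\alpha| \le n_{\max}$ purely to show that it keeps growing with the cutoff.

```python
import math

HBAR = 1.054571817e-27   # erg s (Gaussian units assumed)
C = 2.99792458e10        # cm / s


def angular_frequency(n_vec, box_length):
    """nu = (2*pi*c/l) * sqrt(n1^2 + n2^2 + n3^2), as in (300)."""
    return 2.0 * math.pi * C / box_length * math.sqrt(sum(n * n for n in n_vec))


def field_energy(occupations, box_length):
    """Sum of hbar*nu_t*(n_t + 1/2) over the listed transverse modes.

    occupations maps (n1, n2, n3, j) -> n_t, with j = 2 or 3.
    """
    total = 0.0
    for (n1, n2, n3, _j), n_t in occupations.items():
        total += HBAR * angular_frequency((n1, n2, n3), box_length) * (n_t + 0.5)
    return total


def zero_point_energy(n_max, box_length):
    """Zero-point sum over all modes with |n_alpha| <= n_max (two polarizations)."""
    total = 0.0
    span = range(-n_max, n_max + 1)
    for n1 in span:
        for n2 in span:
            for n3 in span:
                if (n1, n2, n3) == (0, 0, 0):
                    continue
                total += 2 * 0.5 * HBAR * angular_frequency((n1, n2, n3), box_length)
    return total
```

Raising the cutoff always increases `zero_point_energy`, which is the finite-cutoff counterpart of the infinite zero-point energy mentioned in the text.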

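Simultaneously $K$ is the principal system of the field momentum,
$\mathbf{G} = (1/4\pi c)\int\mathbf{E}\times\mathbf{H}\,dv$, for we have

$$\mathbf{G} = -\frac{1}{4\pi c^2}\int\left(\frac{\partial\mathbf{A}}{\partial t}\times\operatorname{curl}\mathbf{A}\right)dv = -\frac{1}{4\pi c^2}\sum_{t,t'}\int\left(\frac{dq_t}{dt}\,\mathbf{u}_t\times q_{t'}\operatorname{curl}\mathbf{u}_{t'}\right)dv \tag{319}$$

From the relation

$$\mathbf{u} = \mathbf{e}\,\sqrt{\frac{4\pi c^2}{l^3}}\;e^{i\mathbf{k}\cdot\mathbf{r}}$$

it follows that

$$\operatorname{curl}\mathbf{u} = i\,[\mathbf{k}\times\mathbf{e}]\,\sqrt{\frac{4\pi c^2}{l^3}}\;e^{i\mathbf{k}\cdot\mathbf{r}}$$

Because of the orthogonality of $\mathbf{u}$ in (319), all products vanish except
those of the form $\mathbf{u}_t\mathbf{u}_t^*$, and, as $\mathbf{u}_t\times\operatorname{curl}\mathbf{u}_t^* = -i\mathbf{k}_t(4\pi c^2/l^3)$, we
obtain

$$\mathbf{G} = i\sum_t\mathbf{k}_t\,\frac{dq_t}{dt}\,q_t^* = i\sum_t\mathbf{k}_t\,p_t^*q_t^*$$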

Upon quantization, this expression becomes 


G= i) PO, = yy nk (Ar + B,) (Ar — B,) 


= ) nk 34(4-A: + BA: — AB, — BB) 
The two terms in the above that contain one $A$ and one $B$ give zero,
because, if $\sum_t$ means summation over all $t$ and $\sum_{t>0}$ only over the
positive $t$, then, since
$\mathbf{k}_{-t} = -\mathbf{k}_t$, $A_{-t} = B_t^{\dagger}$, and $B_{-t} = A_t^{\dagger}$, we have
YK BA: = y k,(BrA; — B_,A-_1) = > k,(B:At bas A,B,) = 0 
t t>0 


t>0 




Furthermore 
Y keBB, = — ) BB, = — Ye Ad = - ) eA 
t t t t 


so that we obtain finally

$$\mathbf{G} = \sum_t \hbar\mathbf{k}_t\left(A_t^{\dagger}A_t + \tfrac{1}{2}\right) \tag{320}$$


Thus energy and momentum permit a simultaneous exact measurement.
The momentum of a wave propagated in the $\mathbf{k}$ direction, the
frequency of which is $\nu$, is always found to be $\hbar k = \hbar\nu/c$ times the
direction of $\mathbf{k}$.

Thus, in certain experiments, the transverse part of the electromagnetic
field behaves like a system of particles that satisfy the Bose
statistics. For the correctness of this corpuscular interpretation of
light, it is essential that the quantities $\hbar\nu$ and $(\hbar\nu/c)\mathbf{e}$ (taken as energy
and momentum respectively of a light quantum) transform like $E$
and $\mathbf{p}$. Indeed, according to the theory of relativity, $E$, $\mathbf{p}$ as well as
$\hbar\nu$ and $(\hbar\nu/c)\mathbf{e}$ form a four-vector, so that the relations $E = \hbar\nu$ and
$\mathbf{p} = (\hbar\nu/c)\mathbf{e}$ are relativistically invariant.
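
The invariance claimed here can be illustrated with a short numerical check. This is a sketch under stated assumptions: units with $\hbar = c = 1$, a boost along the x axis, and helper names chosen purely for illustration. It transforms the pair (energy, momentum) of a light quantum as a four-vector and confirms that the relation between energy and momentum magnitude survives the change of frame.

```python
import math

def boost_along_x(energy, momentum, beta, c=1.0):
    """Lorentz-transform (E, p) to a frame moving with velocity beta*c along x."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    px, py, pz = momentum
    energy_new = gamma * (energy - beta * c * px)
    px_new = gamma * (px - beta * energy / c)
    return energy_new, (px_new, py, pz)

def invariant(energy, momentum, c=1.0):
    """E^2 - c^2 |p|^2, which vanishes for a light quantum."""
    return energy ** 2 - c ** 2 * sum(x ** 2 for x in momentum)

hbar, c, nu = 1.0, 1.0, 2.0
e_dir = (0.6, 0.8, 0.0)                      # unit vector of propagation
energy = hbar * nu                           # E = hbar * nu
momentum = tuple(hbar * nu / c * x for x in e_dir)

energy_b, momentum_b = boost_along_x(energy, momentum, beta=0.5)
nu_b = energy_b / hbar                       # frequency seen in the new frame
p_mag_b = math.sqrt(sum(x ** 2 for x in momentum_b))
print(nu_b, p_mag_b)                         # p_mag_b equals hbar * nu_b / c
```

In the boosted frame the frequency is Doppler shifted, but the momentum magnitude is again $\hbar\nu/c$ with the new frequency, which is what the four-vector property requires.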

Nevertheless we must not overlook the truth that the concept of
light particles explains the facts only within certain limits. All
that we may infer from the preceding is that light, as far as energy and
momentum are concerned, cannot be distinguished from a system of
particles with energies $\hbar\nu$ and momenta $(\hbar\nu/c)\mathbf{e}$. In other respects
there is no equivalence. For example, it proves to be impossible to
assign a definite position to the light quantum, and generally we
may state that the interpretation of the light quanta as real particles
would lead to consequences that are positively wrong. For instance,
we are not permitted to imagine that an atom, in order to absorb a
light quantum, must literally be hit by it. The only claim of the
theory is that always, in the interaction between light and matter,
there is exchanged an energy quantity $\hbar\nu$, which must be understood
in the sense that we are not able to analyze the process by which the
exchange is effected, because any sharp measurement of the energy
carried out on the radiation field excludes a simultaneous observation
of the field quantities E and H. This is evident at once because of
the non-commutability of the matrix for E and the energy matrix
(314). The matrix corresponding to

$$\mathbf{E} = -\frac{1}{c}\frac{\partial\mathbf{A}}{\partial t}$$

is $-(1/c)\sum_t P_t\mathbf{u}_t$, and this is not diagonal in the principal system K
of the energy. Thus, when the vector representing the state has the
direction of an axis of K, so that the energy has a distinct value, a
measurement of E or H may furnish any value. This forms an
essential condition for a consistent union of the particle and the wave
picture. The coexistence of these two pictures is possible because any
experiment in which light has the character of a wave motion makes
the simultaneous observation of particles impossible and vice versa.
The field quantities E and H, belonging to the same point of space,
are also not simultaneously measurable because E and H are given in
the matrix representation by $-(1/c)\sum_t P_t\mathbf{u}_t$ and $\sum_t Q_t\operatorname{curl}\mathbf{u}_t$
respectively, and, because of (315), these matrices are not commutative.
The commutation relation which applies to E and H could be derived
easily from (315), but we shall not go into that problem here.
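
The statement that a field measurement "may furnish any value" in a state of sharp energy can be made concrete for one mode. The sketch below is a simplification and not the construction of the text: it assumes a single mode and uses the Hermitian combination $a + a^{\dagger}$ of one lowering operator as a stand-in (up to a constant factor) for that mode's contribution to the field strength.

```python
import math

def lowering(dim):
    """Truncated lowering operator: <n-1| a |n> = sqrt(n)."""
    a = [[0.0] * dim for _ in range(dim)]
    for n in range(1, dim):
        a[n - 1][n] = math.sqrt(n)
    return a

def field_like_operator(dim):
    """a + a^dagger, a Hermitian matrix that is not diagonal in the number basis."""
    a = lowering(dim)
    return [[a[i][j] + a[j][i] for j in range(dim)] for i in range(dim)]

def moments_in_number_state(n, dim=12):
    """Mean and mean square of the field-like operator in the state with n quanta."""
    f = field_like_operator(dim)
    mean = f[n][n]
    mean_square = sum(f[n][k] * f[k][n] for k in range(dim))
    return mean, mean_square

mean, mean_square = moments_in_number_state(3)
print(mean, mean_square)   # zero mean, but a spread of about 2*n + 1
```

The mean vanishes while the mean square does not, so a state with a definite number of quanta has no definite field value; even the state with no quanta keeps a non-zero spread.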
Finally, note should be made of the point that the Hamiltonian
$H = \sum_t \hbar\nu_t A_t^{\dagger}A_t$ of the pure field contains the quantities $A_t$ and $A_t^{\dagger}$
only in the product. According to Section 43, this means that the
number of light quanta does not change in a field that is not
interacting with charged matter. In fact, light quanta can only be created
or annihilated by processes in which the quanta are emitted or absorbed
by charged particles.

57. Quantization of a System Consisting of Field and Particles.
Now we must extend the quantization to the electrons
contained in the cubic space considered. In the classical theory their
Hamiltonian is given by


$$\sum_k c\,\sqrt{m^2c^2 + \left(\mathbf{p}_k - \frac{e}{c}\mathbf{A}_k\right)^2} \tag{321}$$


plus the Coulomb energy $\sum_{i<k} e^2/r_{ik}$, which is due to the longitudinal
part of the field. In quantum mechanics all n particles form an
antisymmetric system the state of which is described by a vector X.
This vector is referred to a coordinate system K the axes of which
belong to a certain spatial configuration $x_1y_1z_1\,x_2y_2z_2\cdots x_ny_nz_n$ or
$P_1P_2\cdots P_n$ of the particles. It must be understood, however,
that, because of the indistinguishability of the particles, there is no
individual coordination of the $P_i$ terms to the particles, but rather
that a configuration $P_1P_2\cdots P_n$ only means that one of the particles
is at $P_1$, another at $P_2$, and so on. In addition to the $P_i$ positions
of the particles, the spin must be taken into account. To fulfill this
requirement we correlate to any axis of K certain values $\rho_1, \rho_2, \cdots, \rho_n$
of the spin coordinates. The meaning is that the quantity $\rho$ (which
can assume four values only) has the value $\rho_1$ for the particle at $P_1$, the
value $\rho_2$ for that at $P_2$, and so on. The total state of all the particles
then can be represented by the function

$$X = \psi(P_1P_2\cdots P_n\,\rho_1\rho_2\cdots\rho_n) \tag{322}$$

which, by the square of its magnitude, determines the probability
that a particle with the spin state $\rho_1$ is found at $P_1$, that of state $\rho_2$ at
$P_2$, and so on. The function is antisymmetric in its indices, changing
sign when two indices are interchanged.

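According to Dirac (Section 45), in relativistic quantum mechanics
the energy operator of a single particle for the case $\varphi = 0$, with a purely
transverse vector potential $\mathbf{A}$, is given by

$$H = c\sum_{i=1}^{3}\alpha_i\left[p_i - \frac{e}{c}A_i(P)\right] + mc^2\beta$$

wherein $\alpha_i$ and $\beta$ are the matrices defined by (232) and $p_i$ is the
operator $-(\hbar/i)(\partial/\partial x_i)$. The matrices $\alpha_i$ and $\beta$ operate on the spin
coordinate $\rho$, whereas $p_i$ is to act on the coordinates of the point P.
On associating $\alpha_i$ to the vector $\boldsymbol{\alpha}$ and $p_i$ to $\mathbf{p} = -(\hbar/i)\operatorname{grad}$, we can
write

$$H = c\left[\boldsymbol{\alpha},\ \mathbf{p} - \frac{e}{c}\mathbf{A}(P)\right] + mc^2\beta \tag{323}$$

and so, for the energy operator which is to be applied to (322), that
is, to the system of all the particles, considering also the Coulomb
action we obtain

$$H = \sum_k\left\{c\left[\boldsymbol{\alpha}^{(k)},\ \mathbf{p}_k - \frac{e}{c}\mathbf{A}(P_k)\right] + mc^2\beta^{(k)}\right\} + \sum_{i<k}\frac{e^2}{r_{ik}} \tag{323'}$$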


Finally we combine radiation and particles in a total system and
refer its state X to a coordinate system K the axes of which, in addition
to certain $P_1P_2\cdots P_n$ and $\rho_1\rho_2\cdots\rho_n$, are associated with certain
numbers $n_1n_2\cdots$ of light quanta which make up the radiation.
Then, instead of (322), we have

$$X = \psi(P_1P_2\cdots P_n\,\rho_1\rho_2\cdots\rho_n\,n_1n_2\cdots) \tag{324}$$

and the energy operator becomes

$$H = \sum_t \hbar\nu_t A_t^{\dagger}A_t + \sum_k\left\{c\left[\boldsymbol{\alpha}^{(k)},\ \mathbf{p}_k - \frac{e}{c}\mathbf{A}(P_k)\right] + mc^2\beta^{(k)}\right\} + \sum_{i<k}\frac{e^2}{r_{ik}} \tag{325}$$


We are particularly interested in that part of H corresponding to the
interaction between radiation and the particles and expressed by

$$H' = -\sum_k e\left[\boldsymbol{\alpha}^{(k)}, \mathbf{A}(P_k)\right] = -\sum_{kt} e\left(\boldsymbol{\alpha}^{(k)}, Q_t\mathbf{u}_t\right)
= -ie\sum_{kt}\sqrt{\frac{\hbar}{2\nu_t}}\left[\boldsymbol{\alpha}^{(k)}(A_t - B_t)\,\mathbf{u}_t\right] \tag{326}$$

$H'$ is of the first order in $A_t$ and $B_t$, and this means that the interaction
comes about by processes in which a photon is created or annihilated.
Such processes are also responsible for the retarded interaction between
the particles themselves. First, however, we shall show that the
equation


=o = ae (327) 


according to which the system changes with time, is identical with 
Maxwell’s equations translated into the language of quantum mechan- 
ics. As we have seen in Section 33, equation (327) is equivalent to 
the statement that for any observable of the system the following 
equation holds: 
= = AH —HA (328) 
a dt 
We apply (328) to the field quantity E = —(1/c) » (dq:/dt) uz, 
belonging to an arbitrarily given point P of the field. (The longi- 
tudinal part of E can be neglected since it cannot be quantized and, 


as a result, is commutative with H.) If the matrix —(1/c) Pyu,(P) 
t 


which belongs to E(P) is denoted by E(P), then, because 


H=16 ») (»°Q,:Q: + BP: — ta da, Qyui(Px)] 


t kt 




plus parts that commute with E, for EH — HE we obtain 


2 <> 
EY — HE = — y 5, Ps Qv Qvlue + ie {a™, [P.Qv]u(P)u(Pr)} 
ktt' 


tt’ 


For t’ = t, [P:, Qu] gives 

PQQ — GOP. = P.O. - UPd = - *0, 
For t’ = —t, we get 

P.0.0, - QP: = Q(P.0, - GF) = —*a, 


Furthermore [P,Q,] differs from zero only for t’ = —t, so that 


“ rm >) r2Qi — >) [a®, > u(P)u*(Px)] (829) 


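The first term on the right is identical with $c\,\operatorname{curl}\mathbf{H}$, where $\mathbf{H}$ denotes
the matrix belonging to the magnetic field H; for, since
$\operatorname{curl}\operatorname{curl}\mathbf{u}_t = -\nabla^2\mathbf{u}_t = (\nu_t^2/c^2)\,\mathbf{u}_t$,

$$\operatorname{curl}\mathbf{H} = \operatorname{curl}\operatorname{curl}\mathbf{A} = \sum_t\frac{\nu_t^2}{c^2}\,Q_t\mathbf{u}_t$$

In the second term on the right, we use the relation

$$\sum_t \mathbf{u}_t(P)\,\mathbf{u}_t^*(P_k) = 4\pi c^2\,\delta(P - P_k)$$

so that (329) can be transformed into

$$\frac{d\mathbf{E}}{dt} = c\,\operatorname{curl}\mathbf{H} - 4\pi c\sum_k e\,\boldsymbol{\alpha}^{(k)}\,\delta(P - P_k) \tag{330}$$

which corresponds to Maxwell's equation for point charges,

$$\operatorname{curl}\mathbf{H} = \frac{4\pi}{c}\sum_k e\,\mathbf{v}_k\,\delta(P - P_k) + \frac{1}{c}\frac{d\mathbf{E}}{dt}$$

as the Dirac matrix vector $\boldsymbol{\alpha}$ replaces $\mathbf{v}/c$. Similarly we get

$$\frac{d\mathbf{H}}{dt} = -c\,\operatorname{curl}\mathbf{E} \tag{331}$$

whereas, from $\mathbf{H} = \sum_t Q_t\operatorname{curl}\mathbf{u}_t$, it follows that

$$\operatorname{div}\mathbf{H} = 0 \tag{332}$$

The description of the system is completed by the equation

$$\operatorname{div}\mathbf{E} = 4\pi\sum_k e\,\delta(P - P_k) \tag{333}$$

which takes the place of $\operatorname{div}\mathbf{E} = 4\pi\rho$ and holds for the longitudinal
part of the field $\mathbf{E}_l$. Since $\mathbf{E}_l$ is not a matrix but a quantity expressible
by ordinary numbers, equation (333) is not adaptable to our plan but
must be taken from somewhere else. $\mathbf{E}_l$ is an irrotational field originating
from the point charges. It produces the term $\sum_{i<k}(e^2/r_{ik})$ in H, and
therefore it must fulfill equation (333).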

The quantum-mechanical equations, (330) to (333), of electro- 
dynamics are in complete formal agreement with those of the classical 
theory, the only difference being that now the field quantities are inter- 
preted as matrices which satisfy certain commutation relations rather 
than as quantities which can be described by ordinary numbers. 
And so we arrive at equations which for any experiment supply infor- 
mation as to whether light will display undulating or corpuscular 
character. Often it has been maintained that quantum mechanics 
was unable actually to explain the ambiguous nature of light, having 
been able merely to bridge the chasm between the two pictures by a 
purely mathematical formalism. This is a misconception which under- 
estimates the performance of quantum mechanics. Any explanation 
involves the reduction of a phenomenon to certain laws. Quantum 
mechanics represents these laws in the equations which have been 
developed and therefore may claim to explain the facts as well as, let 
us say, ordinary mechanics explains the behavior of macro bodies. 
No theory can offer more than the description of physical facts with 
the aid of a mathematical scheme. 

58. Interaction between Radiation and Matter. All interaction
processes between radiation and charged particles are accounted
for by the part

$$H' = -\sum_k e\left[\boldsymbol{\alpha}^{(k)}, \mathbf{A}(P_k)\right]$$

of the Hamiltonian. The problem defined by (325), together with the
commutation relations, cannot be solved exactly if this part of the
Hamiltonian is to be taken into account. Therefore we must content
ourselves with an approximation by considering $H'$ a small perturbation
and applying the methods of perturbation theory as developed
in Chapter 4. Without $H'$ the two parts of which the system is
composed would be independent. Then the radiation would consist of
light quanta $\hbar\nu_t$, the numbers $n_1n_2\cdots$ of which remain constant.
On the other hand, the stationary states of the particle system would
be given by the solutions of the wave equation

$$\left\{\sum_k\left[c\left(\boldsymbol{\alpha}^{(k)}\mathbf{p}_k\right) + mc^2\beta^{(k)}\right] + \sum_{i<k}\frac{e^2}{r_{ik}}\right\}\psi(P_1\cdots P_n\,\rho_1\cdots\rho_n)
= E\,\psi(P_1\cdots P_n\,\rho_1\cdots\rho_n) \tag{334}$$


On the left-hand side, the expression in the braces is the energy
operator, operating on the state represented by a function of the positions
$P_1P_2\cdots P_n$ and the spin coordinates $\rho_1\rho_2\cdots\rho_n$. The square of
the magnitude of this function determines the probability that, by
measurement, a particle in the state $\rho_1$ will be found at $P_1$, another in
state $\rho_2$ at $P_2$, and so on. On the right-hand side, E denotes an
eigenvalue of the energy. In what follows we shall, for the sake of brevity,
denote the solutions of (334) by $\psi_a, \psi_b, \cdots$ and the corresponding
energies by $E_a, E_b, \cdots$. Then a certain state of the total system
can be characterized by a number a indicating the state of the particle
system, and the numbers $n_1n_2\cdots$ of the light quanta constituting
the radiation. Owing to the interaction, the state does not persist,
but after a given time t there is a certain probability of finding the
system in another state $b, n_1'n_2'\cdots$, which has occurred in such a
way that the particles have assumed another state, and, simultaneously,
by emission and absorption processes, the numbers $n_1n_2\cdots$
of the light quanta have been changed to $n_1'n_2'\cdots$. According to
(187), the probability of a transition from $a, n_1n_2\cdots$ to $b, n_1'n_2'\cdots$
is determined by the matrix element $H'_{a n_1n_2\cdots,\,b n_1'n_2'\cdots}$. This element
refers to that coordinate system K of the Hilbert space the axes of
which are correlated to the states $a, n_1n_2\cdots$. Because of (326), we
have

$$H'_{a n_1n_2\cdots,\,b n_1'n_2'\cdots} = -ie\sum_t\sum_k\sqrt{\frac{\hbar}{2\nu_t}}\;(A_t - B_t)_{n_1n_2\cdots,\,n_1'n_2'\cdots}\left[\boldsymbol{\alpha}^{(k)}\mathbf{u}_t(P_k)\right]_{ab} \tag{335}$$


On the right-hand side the matrices $A_t$ and $B_t$ have indices relative to
the light quanta only, for $A_t$ and $B_t$ have nothing to do with the
particles. On the other hand, $\boldsymbol{\alpha}^{(k)}\mathbf{u}_t(P_k)$ operates only on the state of the
particles by changing a function $\psi$ of $P_i$ and $\rho_i$ into another function
$\psi'$ of the same variables. Referred to the system K, the operator
becomes a matrix the elements of which are to be designated by the
numbers of the particle states. To determine an element $(\boldsymbol{\alpha}^{(k)}\mathbf{u}_t)_{ab}$,
we assume an arbitrary state $\psi(P_1\cdots P_n\,\rho_1\cdots\rho_n)$ to be expanded
in terms of the $\psi_a$ (which form an orthogonal system), that is,

$$\psi = \sum_a x_a\psi_a \tag{336}$$

The action of the operator $(\boldsymbol{\alpha}^{(k)}\mathbf{u}_t)$ then transforms $\psi$ into another
vector $\psi'$:

$$\psi' = \left[\boldsymbol{\alpha}^{(k)}\mathbf{u}_t(P_k)\right]\psi = \sum_a x_a'\psi_a \tag{337}$$

the relation between $x_a$ and $x_a'$ being given by

$$x_a' = \sum_b\left(\boldsymbol{\alpha}^{(k)}\mathbf{u}_t\right)_{ab}x_b \tag{338}$$

We now normalize the $\psi_a$ terms by means of $\int\psi_a^*\psi_b = \delta_{ab}$. The
integral is to be understood in the sense that we must integrate over
the coordinate space of all the particles and consider $\psi_a^*\psi_b$ an
abbreviation for the sum of all the products $\psi_a^*(\rho_1\cdots\rho_n)\,\psi_b(\rho_1\cdots\rho_n)$
with $\rho_i = 1, 2, 3, 4$. From (337) it then follows that

$$x_a' = \int\psi_a^*\left[\boldsymbol{\alpha}^{(k)}\mathbf{u}_t(P_k)\right]\psi$$

or, if we substitute (336) for $\psi$,

$$x_a' = \sum_b x_b\int\psi_a^*\left[\boldsymbol{\alpha}^{(k)}\mathbf{u}_t(P_k)\right]\psi_b$$

On comparison with (338), we find that

$$\left[\boldsymbol{\alpha}^{(k)}\mathbf{u}_t(P_k)\right]_{ab} = \int\psi_a^*\left[\boldsymbol{\alpha}^{(k)}\mathbf{u}_t(P_k)\right]\psi_b \tag{339}$$

$\boldsymbol{\alpha}^{(k)}\mathbf{u}_t$ is to be understood as the scalar product of the vector $\boldsymbol{\alpha}^{(k)}$ and
the vector $\mathbf{u}_t$. The components of $\mathbf{u}_t$ are ordinary numbers, but those
of $\boldsymbol{\alpha}^{(k)}$ are matrices. The product $(\boldsymbol{\alpha}^{(k)}\mathbf{u}_t)$ is, therefore, caused to act
on $\psi$ by transforming the four quantities $\psi(\rho)$, $(\rho = 1, 2, 3, 4)$,
with the help of $(\boldsymbol{\alpha}^{(k)}\mathbf{u}_t)$ into four other quantities $\psi'(\rho)$.
The same considerations apply to the matrices $A_t$ and $B_t$ which occur
in (335) as applied to X in Section 43. Thus only those elements of
$A_t$ and $B_t$ are different from zero for which $n_t' = n_t \pm 1$, whereas all
the other $n_t'$ terms equal the $n_t$ terms. We have


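$$(A_t^{\dagger})_{n_t,\,n_t+1} = (B_t)_{n_t,\,n_t+1} = \sqrt{n_t+1}$$

$$(A_t)_{n_t,\,n_t-1} = (B_t^{\dagger})_{n_t,\,n_t-1} = \sqrt{n_t} \tag{340}$$

In these equations $(A_t^{\dagger})_{n_t,\,n_t+1}$ means an element
$(A_t^{\dagger})_{n_1n_2\cdots n_t\cdots,\;n_1n_2\cdots n_t+1\cdots}$
with arbitrary whole numbers $n_1n_2\cdots$, the other elements to be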

taken in a similar sense. Then it follows from (335) that the transition
probability is different from zero only for the passages $a, n_1n_2\cdots$
to $b, n_1'n_2'\cdots$ in which one of the numbers $n_t$ increases or decreases
by unity, that is, $n_t' = n_t \pm 1$, all other $n_t'$ equaling the $n_t$, since for
any other transition the elements of A and B vanish. A transition
$n_t' = n_t + 1$ corresponds to the emission of a light quantum and is
due to $B_t$, the elements of which, according to (340), differ from zero
only for the case $n_t \to n_t + 1$. On the other hand, $A_t$ is responsible
for absorption processes $n_t \to n_t - 1$. Therefore the probability of
emission or absorption is determined by the elements


$$H'_{a n_t,\,b n_t+1} = ie\sqrt{\frac{\hbar}{2\nu_t}}\,\sqrt{n_t+1}\,\sum_k\int\psi_a^*\left[\boldsymbol{\alpha}^{(k)}\mathbf{u}_t(P_k)\right]\psi_b$$

$$H'_{a n_t,\,b n_t-1} = -ie\sqrt{\frac{\hbar}{2\nu_t}}\,\sqrt{n_t}\,\sum_k\int\psi_a^*\left[\boldsymbol{\alpha}^{(k)}\mathbf{u}_t(P_k)\right]\psi_b \tag{341}$$

The elements are of the first order in the charge e and correspond to
transitions in which only one of the quantum numbers $n_1n_2\cdots$ is
changed. They are, therefore, the fundamental quantities in the
theory of all those processes in which a single photon is created or
annihilated, a condition which is fulfilled, for example, if an atom emits
or absorbs a photon. For the most part, however, in the processes
that are due to the interaction between radiation and matter, two or
more light quanta are involved. For instance, a scattering process
changes the radiation in such a way that a light quantum vanishes to
make room for another one that is created. For the treatment of
such processes the perturbation theory of the first order does not
suffice, as all the matrix elements of $H'$ which correspond to the
transitions vanish. Then we must resort to the second or higher
approximations by effecting the transition, according to Section 37,
through one or more intermediate states. The probability for a
transition, for example, of the kind $a\,n_t n_k \to b\,n_t + 1,\ n_k - 1$ is then
[cf. (186)] determined by the quantity


, 1 
H aninz,cni+1,n, eni+1,nk,bni+1,n-—1 
Pee tant lt ts Se tae Lo a ae 
z Banim i I PRP TaN ee 


that is, by a matrix element of the second order in e. Thus we may 
classify the processes resulting from the interaction between radiation 
and matter, according to the number of light quanta involved, into 
processes of the first and second order and so on, and characterize 
them by the corresponding power of e. 

According to Section 37 the transition probability has a finite value 
only for transitions in which the energy of the system is conserved; 
this holds for processes of any order. It should be noticed, however, 
that the energy of the virtual intermediate states may have any value. 
The momentum is conserved if the particle on which the radiation is 
acting is free. For then the function y in (341) is given by y= 


a(p)e?"* With u = e; V4ac?/l’ e* for a single particle, if we 


designate the product (ae,) by ae, we obtain 


: h|4rc? ; 
Hanioniti = te Ny ay, Ve VM tI Ga, aca) / &oratk ates 
t 


and a corresponding expression for $H'_{a n_t,\,b n_t-1}$. The integral has a
value different from zero only for $\mathbf{p}_a - \mathbf{p}_b = \hbar\mathbf{k}_t$, so that the only
transitions possible are those in which the momentum of the absorbed
or emitted photon is equal to the change in the momentum of the
particle. It must be emphasized, however, that the emission and
absorption of a light quantum by a free particle is possible only for
the transition into a virtual intermediate state. The transition into
the final state would require the conservation not only of the momentum
but also of the final energy, and this condition cannot be fulfilled
in a consistent way. In the case of an emission, this is seen at once
when we refer the particle to that coordinate system in which it is at
rest. The emission of a quantum would then demand that the energy
of the particle diminish by $\hbar\nu$, which, without a loss of mass, is not
possible. On the other hand, an absorption process would require
that the energy increase by $mc^2\left(1/\sqrt{1-\beta^2} - 1\right) = \hbar\nu$ and that the
momentum increase by $mv/\sqrt{1-\beta^2} = \hbar\nu/c$, and these two conditions
are incompatible with each other. But an electron can emit or absorb
a photon by passing into an intermediate state, because then the energy
need not be conserved. Of course, an electron that is bound to an
atom can emit or absorb a light quantum just as well, since it can
transfer part of its momentum to the atom.
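
The incompatibility of the two absorption conditions can be made explicit in a few lines. The following is a sketch under one stated assumption: the particle is initially at rest, so that its energy gain is the kinetic energy $mc^2(\gamma - 1)$, with the shorthand $\gamma = 1/\sqrt{1-\beta^2}$ introduced here for convenience.

```latex
\begin{align*}
mc^2(\gamma - 1) = \hbar\nu, \qquad m\gamma\beta c = \frac{\hbar\nu}{c}
  \quad&\Longrightarrow\quad \gamma - 1 = \gamma\beta, \\
(\gamma - 1)^2 = \gamma^2\beta^2 = \gamma^2 - 1
  \quad&\Longrightarrow\quad \gamma = 1, \quad \beta = 0, \quad \hbar\nu = 0 .
\end{align*}
```

Both conditions hold together only for a quantum of zero energy, so a free particle cannot take up a real light quantum while conserving energy and momentum at once.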

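59. Emission and Absorption of a Light Quantum by an
Atom. We take as an example of a process of the first order the
emission or absorption of a light quantum by an atom, assuming, for
the sake of simplicity, an atom with only one electron. In the
calculation, without a noticeable error, we shall be satisfied with a
non-relativistic approximation by substituting $\mathbf{v}/c$ for $\boldsymbol{\alpha}$ in the expression for
the interaction $H'$, so that

$$H' = -\frac{e}{c}\sum_t Q_t\,(\mathbf{v}\mathbf{u}_t) = -i\,\frac{e}{c}\sum_t\sqrt{\frac{\hbar}{2\nu_t}}\;(A_t - B_t)\,(\mathbf{v}\mathbf{u}_t)$$

where $\mathbf{v}$ is the velocity of the electron and the $\mathbf{u}$ terms are the vector
functions $\mathbf{e}_t\sqrt{4\pi c^2/l^3}\;e^{i\mathbf{k}\cdot\mathbf{r}}$. Then, for the matrix element $H'_{a n_t,\,b n_t+1}$
we obtain

$$H'_{a n_t,\,b n_t+1} = i\,\frac{e}{c}\sqrt{\frac{\hbar}{2\nu_t}}\,\sqrt{n_t+1}\;(\mathbf{v}\mathbf{u}_t)_{ab} \tag{342}$$

and thus

$$\left|H'_{a n_t,\,b n_t+1}\right|^2 = \frac{e^2}{c^2}\,\frac{\hbar}{2\nu_t}\,(n_t+1)\left|(\mathbf{v}\mathbf{u}_t)_{ab}\right|^2 \tag{342'}$$

where $n_t$ is the number of light quanta contained in the space. In
order to evaluate it we must, according to (339), form the expression
$\int\psi_a^*(\mathbf{v}\mathbf{u}_t)\psi_b$, in which $\psi_a$ and $\psi_b$ denote the wave function of the atom
in the states a and b. $(\mathbf{v}\mathbf{u}_t)$ is the operator $(\mathbf{v}\cdot\mathbf{e}_t)\sqrt{4\pi c^2/l^3}\;e^{i\mathbf{k}\cdot\mathbf{r}}$.
The wavelength $2\pi/|\mathbf{k}|$ of the light which is emitted or absorbed by the
atom is always greater than the extension of the atom, so that $e^{i\mathbf{k}\cdot\mathbf{r}}$
may be considered constant within the domain in which $\psi_a$ and $\psi_b$ are
noticeably different from zero: in $|H'|^2$, $e^{i\mathbf{k}\cdot\mathbf{r}}$ and $e^{-i\mathbf{k}\cdot\mathbf{r}}$ therefore give
1. Thus, if $\theta$ denotes the angle between $\mathbf{v}$ and the direction $\mathbf{e}_t$ of
polarization, we have

$$\left|(\mathbf{v}\mathbf{u}_t)_{ab}\right|^2 = |v_{ab}|^2\cos^2\theta\;\frac{4\pi c^2}{l^3}$$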

In the transition $a \to b$, the energy of the atom decreases by $E_a - E_b$,
and a light quantum $\hbar\nu_t$ is created simultaneously. The conservation
of energy demands therefore that $E_a - E_b = \hbar\nu_t$. Now if the matrix
$X = \|x_{ik}\|$ of an observable is referred to the principal system of the
energy, then, according to (164),


(Ey — Ea)tap 


Then, when we apply this relation to the xyz coordinates of the electron,
since $dx/dt = v_x$ we get


(vz)ab = 4 (Ey — Ea)tap = iv¢Lap 


and hence

$$|v_{ab}|^2 = \nu_t^2\left\{|x_{ab}|^2 + |y_{ab}|^2 + |z_{ab}|^2\right\}$$

so that $|H'|^2$ is given by

$$\left|H'_{a n_t,\,b n_t+1}\right|^2 = \frac{2\pi e^2\hbar\nu_t}{l^3}\,(n_t+1)\cos^2\theta\left\{|x_{ab}|^2 + |y_{ab}|^2 + |z_{ab}|^2\right\}$$

Now the characteristic frequencies $\nu_t$ of a cavity radiation correspond
to a very dense line spectrum, and thus the considerations of Section
37 apply. We have seen that, in this case, it is not important to
evaluate the probability with which the creation of a light quantum of
definite frequency $\nu_t$ is to be expected, but that it would be better to
consider all quanta with an energy between E and E + dE and a
direction of propagation that lies within an angle $d\Omega$. According to
(188), a light quantum that satisfies these conditions is created in unit
time with the probability


dW = la? = pao 


Le os bnept|” = fd (ne “- 1) »;” 
i ce 2; 


in which $\rho\,d\Omega\,dE$ denotes the number of the radiation oscillators with
a frequency between $E/\hbar$ and $(E + dE)/\hbar$ and a direction within $d\Omega$.
To determine this number, we use the relation (300), according to
which

$$\nu = \frac{2\pi c}{l}\sqrt{n_1^2 + n_2^2 + n_3^2}$$

If we imagine that in a three-dimensional space all the points P are
marked, the coordinates of which are integer numbers $n_1n_2n_3$, then
any P corresponds to a frequency $\nu = 2\pi c r/l$, where r denotes the
distance of P from the origin of coordinates. The number to be
determined is, therefore, given by the number of the points P between
the distances $r = (l/2\pi c)\nu$ and $r + dr = (l/2\pi c)(\nu + d\nu)$ lying within
the cone $d\Omega$. r and r + dr define a spherical shell of volume $4\pi r^2\,dr$,
and therefore the volume enclosed by $d\Omega$ is $4\pi r^2\,dr\,(d\Omega/4\pi)$. This
element of volume is identical with the number of the enclosed points
which have integer coordinates, so that $\rho\,d\Omega\,dE$ is given by

$$\rho\,d\Omega\,dE = 2\left(\frac{l}{2\pi c}\right)^3\nu^2\,d\nu\,d\Omega
\qquad\text{or}\qquad
\rho\,d\Omega = 2\left(\frac{l}{2\pi c}\right)^3\frac{\nu^2}{\hbar}\,d\Omega \tag{343}$$
The factor 2 is necessary because, for any $\nu_t$, there are two directions of
polarization. For dW we now obtain

$$dW = \frac{2\pi}{\hbar}\cdot 2\left(\frac{l}{2\pi c}\right)^3\frac{\nu_t^2}{\hbar}\,d\Omega\;\frac{e^2}{c^2}\,\frac{\hbar}{2\nu_t}\,(\bar{n}_t+1)\,\nu_t^2\,\frac{4\pi c^2}{l^3}\cos^2\theta\left\{|x_{ab}|^2 + |y_{ab}|^2 + |z_{ab}|^2\right\}$$

$$\phantom{dW} = \frac{e^2}{\pi\hbar c^3}\,\nu_t^3\,d\Omega\,(\bar{n}_t+1)\cos^2\theta\left\{|x_{ab}|^2 + |y_{ab}|^2 + |z_{ab}|^2\right\} \tag{344}$$

where $\bar{n}_t$ is the average number of light quanta per frequency and
direction of polarization. dW is composed of two parts: one part is
proportional to $\bar{n}_t$ and corresponds to an emission induced by the
radiation; the other part, which is independent of $\bar{n}_t$, represents a
spontaneous emission. The total spontaneous emission is found from
(344) by integrating over $d\Omega$. For this purpose we substitute the
angle $\theta'$ between $\mathbf{v}$ and $\mathbf{k}$ for the angle $\theta$ between $\mathbf{v}$ and $\mathbf{e}$. It is verified
readily with the help of a figure that $\cos^2\theta = \tfrac{1}{2}\sin^2\theta'$, so that we
finally obtain for the probability of a spontaneous emission

$$W = \frac{e^2}{2\pi\hbar c^3}\,\nu_t^3\left\{|x_{ab}|^2 + |y_{ab}|^2 + |z_{ab}|^2\right\}\iint\sin^3\theta'\,d\theta'\,d\varphi
= \frac{4e^2}{3\hbar c^3}\,\nu_t^3\left\{|x_{ab}|^2 + |y_{ab}|^2 + |z_{ab}|^2\right\}$$

If we take into account the fact that the notation $\nu$ now represents
the frequency times $2\pi$, this is in agreement with (105).
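
The angular integration that leads from (344) to the total spontaneous emission can be verified numerically. The sketch below is illustrative only: it assumes the prefactor $e^2\nu^3/(\pi\hbar c^3)$ for (344) and the replacement $\cos^2\theta = \tfrac{1}{2}\sin^2\theta'$ quoted above, works in units where $e = \hbar = c = 1$, and uses function names that are not part of the text.

```python
import math

def solid_angle_integral(f, n_theta=2000):
    """Midpoint rule for the integral of f(theta) over the full sphere.

    f is assumed independent of the azimuth, so that angle contributes 2*pi.
    """
    h = math.pi / n_theta
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * h
        total += f(theta) * math.sin(theta) * h
    return 2.0 * math.pi * total

# Integral of (1/2) sin^2(theta') over all directions of propagation.
angular_factor = solid_angle_integral(lambda th: 0.5 * math.sin(th) ** 2)

def spontaneous_rate(e2, nu, r2_ab, hbar=1.0, c=1.0):
    """Total spontaneous rate; r2_ab stands for |x_ab|^2 + |y_ab|^2 + |z_ab|^2."""
    return e2 * nu ** 3 / (math.pi * hbar * c ** 3) * angular_factor * r2_ab

print(angular_factor, 4.0 * math.pi / 3.0)   # the two numbers agree closely
print(spontaneous_rate(1.0, 1.0, 1.0))       # close to 4/3
```

The factor $4\pi/3$ from the angular integral combines with the prefactor to give the coefficient $4e^2/3\hbar c^3$ of the total rate.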

In order to evaluate the probability of an absorption we must start
from the matrix element

$$H'_{a n_t,\,b n_t-1} = -i\,\frac{e}{c}\sqrt{\frac{\hbar}{2\nu_t}}\,\sqrt{n_t}\;(\mathbf{v}\mathbf{u}_t)_{ab}$$

which differs from (342) in that it contains $\sqrt{n_t}$ instead of $\sqrt{n_t+1}$.
This means that there is no spontaneous but only an induced
absorption, a fact that agrees with experience.
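
The oscillator count behind (343) rests on the claim that the number of integer points in a spherical shell equals the volume of the shell. A brute-force check is sketched below; the shell radii are arbitrary illustrative values, a finite shell is compared with its exact volume rather than with the differential $4\pi r^2\,dr$, and the factor 2 for the two polarizations is included as in the text.

```python
import math

def count_oscillators(r_lo, r_hi):
    """Count integer points (n1, n2, n3) with r_lo <= r < r_hi, times 2 polarizations."""
    n_max = int(math.ceil(r_hi))
    lo2, hi2 = r_lo * r_lo, r_hi * r_hi
    points = 0
    for n1 in range(-n_max, n_max + 1):
        for n2 in range(-n_max, n_max + 1):
            for n3 in range(-n_max, n_max + 1):
                r2 = n1 * n1 + n2 * n2 + n3 * n3
                if lo2 <= r2 < hi2:
                    points += 1
    return 2 * points

def shell_estimate(r_lo, r_hi):
    """Twice the volume of the spherical shell between r_lo and r_hi."""
    return 2.0 * (4.0 * math.pi / 3.0) * (r_hi ** 3 - r_lo ** 3)

counted = count_oscillators(20.0, 25.0)
estimated = shell_estimate(20.0, 25.0)
print(counted, round(estimated))   # the two agree to within a few per cent
```

With $r = (l/2\pi c)\nu$ the shell volume turns into the factor $(l/2\pi c)^3\nu^2\,d\nu$ of (343); the agreement improves as the radii grow, which corresponds to the dense spectrum of a large cavity.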

60. The Divergences Occurring in the Higher Approximations.
Here we are to consider the influence the interaction term
$H'$ has on the energy levels of the system consisting of radiation and
particles. Without $H'$, these levels would be given by
$E = E_0 + \sum_t n_t\hbar\nu_t$, where $E_0$ denotes the energy belonging to a stationary state
of the particle system. Because of $H'$, E is to be corrected by additive
terms $E^1, E^2, \cdots$ which are to be calculated according to the method
of Section 35. As shown there, the correction term $E^1$ of the first
order for the kth level is given by the diagonal term $H'_{kk}$ of the
perturbation matrix, which in our case is $H'_{a n_1n_2\cdots,\,a n_1n_2\cdots}$. As is seen
from (326), this term is zero since all the diagonal elements of $A_t$ and
$B_t$ vanish. The interaction of the first order leaves the energy levels
unchanged. The term $E^2$, on the contrary, according to (177) is
given by

$$E^2 = \sum_m\frac{H'_{km}H'_{mk}}{E_k - E_m}
= \sum_{b n_1'n_2'\cdots}\frac{H'_{a n_1n_2\cdots,\,b n_1'n_2'\cdots}\;H'_{b n_1'n_2'\cdots,\,a n_1n_2\cdots}}{E - E'} \tag{345}$$


and is not zero. $E^2$ can be evaluated on the basis of certain simplifying
assumptions. We consider the case of one electron only and assume
that all the numbers $n_t$ of the light quanta are zero. The particle
may be at rest, and therefore its energy is $mc^2$. Then in the summation
only such of the elements $H'_{a n_1n_2\cdots,\,b n_1'n_2'\cdots}$ appear for which one of
the $n_t'$ has the value unity, all the other $n_t'$, and the $n_t$ as well, being
zero. The ground state of the particle is signified by a, and in what
follows the subscript 0 refers to that state. Some other state of positive
or negative energy is denoted by b. Making use of (341), and
because $n_t = 0$, we have, for $E^2$,

$$E^2 = \sum_{b,t}\frac{e^2\hbar}{2\nu_t}\;\frac{\int\psi_0^*(\boldsymbol{\alpha}\mathbf{u}_t)\psi_b\int\psi_b^*(\boldsymbol{\alpha}\mathbf{u}_t)\psi_0}{mc^2 - (E_b + \hbar\nu_t)} \tag{346}$$

The denominator accounts for the fact that the total energy of the
system in the initial state is the rest energy $mc^2$ of the electron, whereas
the final energy is made up of the energy $E_b$ of the particle plus that
of the light quantum $\hbar\nu_t$. As the evaluation of the sum is somewhat
troublesome,† we shall be content to give the result


he 1 
E? aa Cee ms ey 
mi’ r Vt 


† W. Heitler, Theory of Radiation, Oxford University Press, 1948.




in which the summation extends over all the eigenfrequencies of the
radiation. As the number of frequencies between $\nu$ and $\nu + d\nu$ is,
according to (343), given by

$$8\pi\left(\frac{l}{2\pi c}\right)^3\nu^2\,d\nu$$

then $E^2$ can also be written in the form


2 ry 
E? = +3; yay (347) 


men 


The integral being divergent, the particle, in its interaction with the
light quanta, undergoes an infinite displacement of its energy level,
the consequence being that, besides an infinite Coulomb energy due
to the longitudinal field, it would possess an infinite transverse
self-energy as well. Because $\hbar$ occurs in (347), we see that the transverse
self-energy is a quantum effect which diverges not only in the second
but in all higher approximations as well. Evidently this divergence is
due to the short waves contained in the radiation. Generally the
application of the quantum-mechanical perturbation theory may be
expected to furnish infinite correction terms of the energy provided
that there is an infinity of states n for which the elements $H'_{mn}$ of
the perturbation matrix differ from zero, thus making possible
transitions from m into all these states. The transverse self-energy of the
electron being considered here is only a typical example of divergences
of this kind. Another example is the infinite self-energy of a light
quantum caused by the interaction of the quantum with electrons of
negative energy. In this example the divergence results from the
infinity of intermediate states into which the system of light quantum
and electrons can pass by the creation of pairs.
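
How strongly these quantities depend on the shortest waves admitted can be seen by cutting the frequency integrals off at a finite upper limit. The sketch below rests on two assumptions: that the integrand of (347) grows linearly with $\nu$, and that the zero point energy, a sum of $\hbar\nu_t/2$ over modes whose density grows as $\nu^2$, has an integrand growing as $\nu^3$. The cut-off itself is a device of this illustration and not part of the theory; constant prefactors are omitted.

```python
def cutoff_integral(power, nu_max, steps=20000):
    """Midpoint-rule value of the integral of nu**power from 0 to nu_max."""
    h = nu_max / steps
    return sum(((i + 0.5) * h) ** power for i in range(steps)) * h

for nu_max in (10.0, 20.0, 40.0):
    self_energy_like = cutoff_integral(1, nu_max)   # grows as nu_max**2
    zero_point_like = cutoff_integral(3, nu_max)    # grows as nu_max**4
    print(nu_max, self_energy_like, zero_point_like)
```

Doubling the cut-off multiplies the first integral by 4 and the second by 16, so neither approaches a limit as the cut-off is removed.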

In summary we may say that the application of quantum mechanics 
to the electromagnetic field leads to a theory that basically is, without 
doubt, correct, inasmuch as it embraces a great part of the experi- 
mental facts. There are certain consequences, however, which contra- 
dict experience in that some quantities which are unquestionably finite 
(such as the self-energy of the electron) assume infinite values. Not 
all these troublesome divergences, which among other things make a 
consistent relativistic solution of the many-body problem impossible, 
originate in quantum mechanics. The infinite Coulomb self-energy 
of a point charge is present in the classical theory, which attempts to
overcome the difficulty by assigning to the electron an extension of the
order of magnitude $r_0 = e^2/mc^2$. This assumption cannot, however,
be reconciled with the principle of relativity. According to this
theory any extended body can be deformed, so that an extended electron
would have to be pictured as a thing consisting of parts that move relative
to each other, an idea evidently incompatible with the concept of an 
elementary particle. On principle, therefore, it is impossible to devise 
a relativistic theory in which the particle is treated as an extended 
body without thereby losing the character of an elementary particle. 
The result of such an attempt would invariably be a continuum 
theory in which the concept of an indivisible particle has no meaning. 
Already, for this reason, the classical theory was forced to consider the 
electron a point charge, with the consequence that an infinite self- 
energy had to be accepted in the bargain. Since, however, this energy 
originates from the longitudinal field which cannot be quantized, 
quantum mechanics could not avoid taking over this divergence as it 
stood. But quantum mechanics presents added divergences which 
are not, at least not immediately, connected with the point character 
of the particles but arise from the necessity of taking into account 
wavelengths of any shortness, that is, of considering fields with an 
infinite number of degrees of freedom. It is because of this circum- 
stance that the theory has to attribute to a cavity radiation an infinite 
zero point energy caused by the short wavelengths. We have seen 
also that the evaluation of the transverse self-energy of a charged 
particle provides infinite correction terms of the second and higher 
orders. This situation suggests the conclusion that there must be a 
certain limit to the applicability of quantum mechanics by which the 
effectiveness of the short waves is restricted. The limitation must be
due to the effect of a fundamental constant which has been disregarded
up to now and the theory of which will be developed in
Chapter 10.
PROBLEMS 


1. From (315) derive the commutation relations for E and H. 

2. Describe the manner in which the ambiguous nature of light has to be under- 
stood according to quantum mechanics. 

3. Why is it impossible to quantize the Coulomb energy? 

4. From what arguments can it be inferred that the rest mass of the photon is
zero? Compare the equations of Maxwell with those of Proca. In what way 
would light have to behave if the rest mass of the photon differed from zero? 

5. Discuss the origin of the divergences occurring in quantum electrodynamics. 


9
WAVE FIELDS AND NUCLEAR MATTER


The meson is the most important particle with which modern physics 
is concerned. It may be taken for granted that in the cosmic radiation
there are positive and negative particles with a mass about two
hundred times that of the electron. However, more recent
investigations make it probable that there are mesons of different 
masses which seem to be transmuted into one another by some sort of 
radioactive decay. From experimental evidence, the normal meson 
possesses only a very short mean lifetime, after which it disintegrates 
into an electron and a neutrino. One of the reasons why such great 
importance is attached to this particle is that it seems to mediate the 
forces the protons and neutrons of an atomic nucleus exert on one 
another. The generally accepted view is that these forces are brought 
into play by an interchange of mesons between the nuclear particles. 
In a manner similar to that in which the electrons act on one another 
by a field, the nuclear particles are assumed to be connected by a field 
which satisfies a second-order wave equation. The field originates 
from the nuclear particles and manifests itself in the appearance of 
mesons which, when the field is quantized, appear also in the mathe- 
matical formalism. In this way the interaction between two nuclear 
particles appears in the particle picture as an exchange of mesons. 
Let us suppose, for example, that a proton is face to face with a neutron. 
The proton may emit a positive meson which is subsequently absorbed 
by the neutron, and as a result the proton is changed into a neutron 
and vice versa. This means that momentum is transferred from one 
particle to the other and therefore a force is acting between the two 
particles. 

Up to the present, attempts to develop this idea into a consistent 
theory have failed, being handicapped by divergences which cannot 
be removed satisfactorily without a supplementary principle. There- 
fore, in what follows, we shall be concerned only with the foundations 
of the theory, investigating the possibilities given for the interaction
of the wave fields considered in the two preceding chapters, and nuclear 
matter. Such far-reaching restrictions are imposed on these pos- 
sibilities by the theory of relativity that, in seeking the interaction 
terms, we cannot go astray. On the other hand, the application of 
the theory to the problems of nuclear forces and cosmic radiation has 
as yet produced no definite results. We shall, therefore, merely out- 
line these problems without going into details. 

61. The Lagrangian of the Interaction.† The fields with which
we have dealt in the preceding chapters do not exist in vacuo but issue 
from material particles on which they are reacting, so that the fields 
and the corresponding particles form closed systems. We base the 
following considerations on the assumption that it is the protons and 
neutrons of an atomic nucleus which, because of a certain “charge,”
are able to radiate and absorb the field. (The concept “charge”
is used here in a general sense and has nothing to do with an electric
charge.) It is convenient to describe this property by means of the
Lagrangian which corresponds to the system of field and particles,
and which must contain a term representing the interaction between
field and matter. By its nature, the Lagrangian is an invariant scalar
function, and therefore, in determining the interaction, we are confronted
with the problem of forming an invariant out of the field
quantities $\psi$ and $\chi$, introduced in Chapter 7, and at the same time from
the wave function $\Phi$ of a nuclear particle. As the spin of the nuclear
particles is $\tfrac{1}{2}$, we have to apply the Dirac equation to them. Consequently
$\Phi$ is identical with the quantity consisting of four components
introduced in Section 45 in order to treat particles having the spin $\tfrac{1}{2}$.
It can be shown that with the help of the matrices defined by (232): 


me Ure 7 ip ee 0 
«=|? “I e-|5 i 


n={{ nA n-(|°, “s nolo ger 1=[4 A 


the following tensor functions can be formed from $\Phi$:

Scalar: $w_0 = \Phi^*\beta\Phi$
Four-vector: $w_i = \Phi^*\alpha_i\Phi$, $w_4 = i\Phi^*\Phi$
Antisymmetric tensor of second order: $w_{ik} = \Phi^*\beta\alpha_i\alpha_k\Phi$, $w_{4k} = i\Phi^*\beta\alpha_k\Phi$   (348)
Pseudovector: $w_{ik4} = \Phi^*\alpha_i\alpha_k\Phi$, $w_{123} = -i\Phi^*\alpha_1\alpha_2\alpha_3\Phi$
Pseudoscalar: $w_{1234} = \Phi^*\beta\alpha_1\alpha_2\alpha_3\Phi$

† Cf. N. Kemmer, Proc. Roy. Soc. London, A166, 127 (1938).


The reader is reminded that $x_4$ is defined as $ict$ and that the components
of the tensors must be understood in accordance with this
definition, the consequence being that there is no difference between a
covariant and contravariant tensor. In the products above, $\Phi$ is to be
treated as a matrix with its components in the first vertical column,
and $\Phi^*$ contains the components in the first horizontal row (cf. Section
23). When the coordinate system is changed, the products transform
according to their notation. All the quantities in (348) are, of course,
to be considered as functions of $x$, $y$, $z$, and $t$.
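
The bilinear products of (348) can be illustrated numerically. The following is only a sketch: it assumes the standard Dirac representation of $\alpha_i$ and $\beta$ (which may differ in sign conventions from the matrices of (232)), and the sample spinor `phi` is an arbitrary choice made for illustration.

```python
# Pauli matrices in the standard convention (an assumption; the book's own
# definitions may differ by signs).
S1 = [[0, 1], [1, 0]]
S2 = [[0, -1j], [1j, 0]]
S3 = [[1, 0], [0, -1]]


def block_offdiag(s):
    """Return the 4x4 matrix [[0, s], [s, 0]] built from a 2x2 block s."""
    zero_row = [0, 0]
    top = [zero_row + s[r] for r in range(2)]
    bottom = [s[r] + zero_row for r in range(2)]
    return top + bottom


ALPHA = [block_offdiag(s) for s in (S1, S2, S3)]
BETA = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def anticommutator(a, b):
    """Return ab + ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] + ba[i][j] for j in range(4)] for i in range(4)]


def bilinear(phi, m):
    """Phi* M Phi: Phi is read as a column, Phi* as the conjugate row."""
    n = len(phi)
    return sum(phi[i].conjugate() * m[i][j] * phi[j]
               for i in range(n) for j in range(n))


phi = [1 + 0j, 0.5j, 0.25 + 0j, -0.5 + 0.5j]  # arbitrary sample spinor

w0 = bilinear(phi, BETA)                      # scalar
w_vec = [bilinear(phi, a) for a in ALPHA]     # space part of the four-vector
w4 = 1j * bilinear(phi, IDENTITY)             # fourth component, i * Phi* Phi
```

With $x_4 = ict$ the fourth component $w_4$ comes out purely imaginary, while $w_0$ and the three $w_i$ are real.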

Let us now, on the other hand, consider the fields at our disposal. 
Each of them can be described by two functions, each of the functions 
representing a scalar, vector, or antisymmetric tensor. Throughout 
Chapter 7 we denoted the tensor of the lower order by $\psi$ and the other
by $\chi$ according to the following scheme:


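Scalar field: Scalar $\psi$ and four-vector $\chi_\alpha$ (Section 47)
Vector field: Four-vector $\psi_\alpha$ and tensor $\chi_{\alpha\beta}$ (Section 49)
Pseudovector field: Tensor $\psi_{\alpha\beta}$ and four-vector $\chi_{\alpha\beta\gamma}$ (Section 50)   (349)
Pseudoscalar field: Pseudovector $\psi_{\alpha\beta\gamma}$ and pseudoscalar $\chi_{\alpha\beta\gamma\delta}$ (Section 48)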


The problem now is to unite the quantities of (348) and (349) in 
such a way as to form invariants. If we make the assumption that 
the interaction depends only on the functions $\psi$ and $\chi$ but not on their
derivatives, the combination can be effected in this way only. This is
the simplest and therefore the most plausible assumption on which
the theory can be founded. Then the invariance can be achieved
only by multiplying every quantity of (349) by the corresponding
quantity of (348). For example, we must multiply $\psi_{\alpha\beta}$ by $w_{\alpha\beta}^*$ and
sum over all the products $\psi_{\alpha\beta}w_{\alpha\beta}^*$. If we provide the products
with factors, the interaction is described by


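Scalar field:
$-\kappa\,(g_1 w_0\psi^* + f_1 w_\alpha\chi_\alpha^*)$ + conjugate complex

Vector field:
$-\kappa\,(g_2 w_\alpha\psi_\alpha^* + f_2\,\tfrac{1}{2} w_{\alpha\beta}\chi_{\alpha\beta}^*)$ + conjugate complex

Pseudovector field:   (350)
$-\kappa\,(g_3\,\tfrac{1}{2} w_{\alpha\beta}\psi_{\alpha\beta}^* + f_3\,\tfrac{1}{6} w_{\alpha\beta\gamma}\chi_{\alpha\beta\gamma}^*)$ + conjugate complex

Pseudoscalar field:
$-\kappa\,(g_4\,\tfrac{1}{6} w_{\alpha\beta\gamma}\psi_{\alpha\beta\gamma}^* + f_4\,\tfrac{1}{24} w_{\alpha\beta\gamma\delta}\chi_{\alpha\beta\gamma\delta}^*)$ + conjugate complex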


According to our convention, the expressions are to be summed up 
from 1 to 4 over any index that occurs twice. The factors $\tfrac{1}{2}$, $\tfrac{1}{6}$, and
$\tfrac{1}{24}$ are inserted because the index convention sometimes provides the
same product several times. As an example of this, from $w_{\alpha\beta\gamma}\psi_{\alpha\beta\gamma}^*$
we obtain the product $w_{123}\psi_{123}^*$ $3!$ times. The factors mentioned
above remove this redundancy, so that the products appear in the
expressions only once. The factor $-\kappa$ is chosen for reasons of convenience.
The numbers $g_i$ and $f_i$ express the strength of the interaction.
For the present we shall treat them as ordinary numbers, but
it will turn out that for charged mesons they have the character of two-row
matrices.
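
The counting behind the factors $\tfrac{1}{2}$, $\tfrac{1}{6}$, and $\tfrac{1}{24}$ can be checked with a short script. This is a sketch only: the tensor entries are arbitrary integers chosen for illustration, the complex conjugation is left out because it does not affect the counting, and the helper names are not from the text.

```python
from itertools import combinations, permutations, product
from math import factorial


def perm_sign(p):
    """Sign of the permutation that sorts the tuple p (parity of inversions)."""
    sign = 1
    for i in range(len(p)):
        for j in range(i + 1, len(p)):
            if p[i] > p[j]:
                sign = -sign
    return sign


def antisymmetric_tensor(values):
    """Extend values given on increasing index tuples to all index orders."""
    tensor = {}
    for combo, value in values.items():
        for p in permutations(combo):
            tensor[p] = perm_sign(p) * value
    return tensor


def full_contraction(a, b, rank):
    """Sum a[idx] * b[idx] over every index tuple with entries 1..4."""
    return sum(a.get(idx, 0) * b.get(idx, 0)
               for idx in product(range(1, 5), repeat=rank))


results = {}
for rank in (2, 3, 4):
    combos = list(combinations(range(1, 5), rank))
    w = antisymmetric_tensor({c: i + 1 for i, c in enumerate(combos)})
    psi = antisymmetric_tensor({c: 2 * i + 3 for i, c in enumerate(combos)})
    independent = sum(w[c] * psi[c] for c in combos)
    results[rank] = (full_contraction(w, psi, rank), independent)
```

For each rank the unrestricted sum equals `factorial(rank)` times the sum over independent components, which is what the prefactors in (350) compensate.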

The expressions of (350) refer to the case where there is only one 
nuclear particle. If the field interacts with several particles, we have 
to assign a wave function $\Phi^p$ to each one of them and substitute
the sums $w_0 = \sum_p \Phi^{p*}\beta\Phi^p$, $w_i = \sum_p \Phi^{p*}\alpha_i\Phi^p$, and so on for $w_0$, $w_i$, $\ldots$.


62. Scalar and Pseudoscalar Fields. First we shall apply the 
theory to a scalar field which is in interaction with one nuclear particle. 
According to (250) and (350), the Lagrangian is then given by 


$L = -\kappa\,(\psi^*\psi + \chi_\alpha^*\chi_\alpha) - \kappa\,(g w_0\psi^* + f w_\alpha\chi_\alpha^*)$ + conjugate complex   (351)


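In order to obtain from this the Hamiltonian $H$ which is required for
the evaluation of the transition probabilities, we must evaluate first
the momenta $\pi = \partial L/\partial(\partial\psi/\partial t)$ and $\pi^* = \partial L/\partial(\partial\psi^*/\partial t)$ associated
with $\psi$ and $\psi^*$. Since $\partial\psi/\partial t = i\kappa c\chi_4$, we find that

$\pi = \frac{1}{i\kappa c}\frac{\partial L}{\partial\chi_4} = -\frac{1}{ic}\,(\chi_4^* + f^* w_4^*)$
(351′)
$\pi^* = \frac{1}{i\kappa c}\frac{\partial L}{\partial\chi_4^*} = -\frac{1}{ic}\,(\chi_4 + f w_4)$

Hence

$\frac{\partial\psi}{\partial t} = i\kappa c\chi_4 = \kappa c^2\left(\pi^* + \frac{1}{ic} f w_4\right)$

and thus for $H = (\partial\psi/\partial t)\pi + (\partial\psi^*/\partial t)\pi^* - L$ we obtain

$H = \kappa c^2\left(\pi^* + \frac{1}{ic} f w_4\right)\pi + \kappa c^2\left(\pi + \frac{1}{ic} f^* w_4^*\right)\pi^* + \kappa\psi^*\psi + \kappa\chi_i^*\chi_i$
$\quad - \kappa c^2\left(\pi^* + \frac{1}{ic} f w_4\right)\left(\pi + \frac{1}{ic} f^* w_4^*\right) + \kappa g w_0\psi^* + \kappa f w_i\chi_i^*$
$\quad - \kappa f\,ic\,w_4\left(\pi + \frac{1}{ic} f^* w_4^*\right)$ + conjugate complex

Where an index $i$ occurs twice in this expression it means a summation
from 1 to 3 only, and the added conjugate complex applies only to the
last three terms. The expression reduces to

$H = \kappa\,(c^2\pi^*\pi + \psi^*\psi + \chi_i^*\chi_i) + \kappa\,(g w_0\psi^* + f w_i\chi_i^* - icf w_4\pi)$ + conjugate complex   (352)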


and represents the energy density of the field. The first terms in 
parentheses belong to the pure field, whereas the remaining ones give 
the energy due to the interaction of the field with nuclear matter. If 
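we take into account that $\chi = (1/\kappa)\,\mathrm{grad}\,\psi$, then, because of (348), the
interaction energy $H'$ becomes

$H' = \kappa\int dv\,\Phi^*\left\{g\beta\psi^* + f\left[c\pi + \frac{1}{\kappa}(\alpha\,\mathrm{grad}\,\psi^*)\right]\right\}\Phi$ + conjugate complex   (353)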


The reader is advised to keep carefully in mind how a product of the
kind $\Phi^*[\cdots]\Phi$ is to be read. The bracket contains a four-row
matrix (the term $fc\pi$ is to be thought of as multiplied by the unit
matrix) which operates on the vector $\Phi$, after which the new vector is
to be multiplied by $\Phi^*$. The operators $\alpha$ and $\beta$ belong to the nuclear
particle. If the velocity $v$ of this particle is small compared to $c$, we
may substitute $v$ for $c\alpha$, so that, for a particle at rest, the operator
$\alpha\,\mathrm{grad}\,\psi^*$ drops out. Furthermore we may set $\beta = 1$ for a particle
at rest, so that, on the assumption $f = 0$, (353) becomes

$H' = \kappa\int dv\,\Phi^*(g\psi^* + g^*\psi)\Phi = \kappa\,[g\psi^*(P) + g^*\psi(P)]$   (354)

where $\psi(P)$ denotes the value of the function $\psi$ at the point $P$ occupied
by the particle. For the transformation, attention is given to the fact
that the probability density $\Phi^*\Phi$ differs from zero only in the immediate
neighborhood of $P$, so that $\Phi^*\Phi$ may be set equal to $\delta(P' - P)$.




In the particle picture the interaction is looked upon as the emission 
or absorption of a scalar meson by the nuclear particle, this being 
analogous to the emission or absorption of a photon by an electron. 
To formalize this conception it is necessary to quantize the field by 
expanding $\psi$ into


ik, 


1 
- 75 2, * 


and transcribing the $q_i$ coefficients into the matrices of (280). This
gives 


i = 
A tee ee eee ee oe Re 
Nas) veered By)e g*(A; — B,e***] (355) 


$H'$ is again referred to the coordinate system $K$ which was defined in
Section 52, wherein the products $A_i^\dagger A_i$ and $B_i^\dagger B_i$ are diagonal. In
the case where $\psi$ is complex, that is, for a charged field, the axes of $K$
are to be designated by two sequences of numbers $n_1^+ n_2^+ \cdots$,
$n_1^- n_2^- \cdots$, so that we have


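$(A_i)_{n_i^+,\,n_i^+ + 1} = \sqrt{n_i^+ + 1}$, $(A_i^\dagger)_{n_i^+,\,n_i^+ - 1} = \sqrt{n_i^+}$

$(B_i)_{n_i^-,\,n_i^- + 1} = \sqrt{n_i^- + 1}$, $(B_i^\dagger)_{n_i^-,\,n_i^- - 1} = \sqrt{n_i^-}$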


all other elements being zero. In these expressions it is to be understood
that all the indices $n_i^+$ and $n_i^-$ which are not included have
arbitrary values. However, any $n_i'$ is equal to $n_i$, and this means
that for $H'$ only the elements

(356)

$H'_{n_1^+ n_2^+ \cdots\, n_1^- n_2^- \cdots,\; n_1^{+\prime} n_2^{+\prime} \cdots\, n_1^{-\prime} n_2^{-\prime} \cdots}$

differ from zero, for which exactly one of the $n_i^{+\prime}$ or $n_i^{-\prime}$ differs from
$n_i^+$ or $n_i^-$ by $\pm 1$. A transition $n_i \to n_i + 1$, which, according to
(356), can be effected only by one of the $A_i$ or $B_i$, means the emission of
a meson, whereas the transition $n_i \to n_i - 1$ due to $A_i^\dagger$ or $B_i^\dagger$ indicates
absorption. Therefore, of the matrices in (355), $A_i$ and $B_i$ are responsible
for the emission and $A_i^\dagger$ and $B_i^\dagger$ for the absorption of a positive
or negative meson. In the case of a real field, the mesons are neutral,
and then, according to (284), we have to substitute $A_{-i}$ and $A_i$ for
$B_i$ and $B_{-i}$, making it unnecessary to distinguish between $n_i^+$ and $n_i^-$.
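
The structure of such emission and absorption matrices can be illustrated with a truncated example. This is a sketch under stated assumptions: the first index is read as the initial and the second as the final occupation number (as the elements $H'_{n_i,\,n_i+1}$ for emission suggest), the truncation at five levels is arbitrary, and the function names are hypothetical.

```python
from math import isclose, sqrt


def emission_matrix(size):
    """Matrix with elements M[n][n+1] = sqrt(n+1); all other elements zero."""
    m = [[0.0] * size for _ in range(size)]
    for n in range(size - 1):
        m[n][n + 1] = sqrt(n + 1)
    return m


def transpose(m):
    """Transpose of a real matrix (equal to its adjoint)."""
    return [list(row) for row in zip(*m)]


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


A = emission_matrix(5)      # connects n with n + 1 only
A_dagger = transpose(A)     # connects n with n - 1 only
number_matrix = matmul(A_dagger, A)
```

The product `number_matrix` is diagonal with entries 0, 1, 2, ..., which is the sense in which the axes of $K$ can be labeled by occupation numbers.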




According to (184), the elements $H'_{nm}$ of $H'$ determine the probability
of a transition $n \to m$. The emission of a positive meson having
the energy $\epsilon_i = c\sqrt{m_0^2c^2 + k_i^2\hbar^2}$ has a probability which must be
evaluated from


ch 1 , ‘ 
Dl ngt nett = —ig*k >) a A nit net et*iFo 
20 V1 + 2/0? 


3 
—ig*h — V nit + 1 etki 
€; 


In this expression $r_0$ denotes the radius vector from the origin to the
position of the nuclear particle. Similarly the elements of the other 
possible processes are given by 


3 
H' n-n-41 = — igh <a ig a ge 
&% 


3 
A ngt,nit—1 = igh a V nit eikiFo (857) 


3 
ORL) eye 
|: Le = ig*h — Ni eKiFo 
21%; 


In the case of neutral mesons, (355) must be multiplied by $1/\sqrt{2}$ (cf.
Section 52), and, since $A_{-i} = B_i$ and $A_i = B_{-i}$, the sum reduces to


chi 1 a ‘ 
H' = teal ee ) eps pitty (Tp rdlyeT? 
B25 Ti hte @ g*)( i) 
and for $H'_{n_i,\,n_i+1}$ and $H'_{n_i,\,n_i-1}$ we obtain


’ * Cc. 3 
HD ninipl = fie Vr Vini + 1Le™Fo 


* 3 
F' nins—1 = poohd” all Vn; ero 
2 l 

The theory just developed, according to which the interaction 
between a nuclear particle and a meson field can be interpreted in 
terms of emission and absorption processes, depends on an essential 
condition when the meson field is charged. The processes considered 
are possible only if the protons and neutrons are capable of transmuting 
themselves into each other. This is essential since a proton, on
emitting a positive meson, loses its charge and becomes a neutron 
and, conversely, a neutron, on emitting a negative meson, becomes a 
proton. Thus we are forced to consider proton and neutron as two 
transmutable modifications of the same particle, the nucleon. Therefore, 
in order to characterize the state of a nucleon, we require, in addition
to $x$, $y$, $z$ and the spin coordinate $\rho$, a fifth quantity $\tau$ which by arbitrary
convention has the value zero for the neutron and unity for the proton.
This means a corresponding extension of the coordinate system $K$ of
the Hilbert space. For a description of a system of one nucleon and
the field, $K$ would be sufficient without $\tau$ provided that the axes were
specified by: (i) two sequences of numbers $n_1^+ n_2^+ \cdots$, $n_1^- n_2^- \cdots$,
which refer to the field and indicate a state in which, for any $i$, there
are $n_i^+$ positive and $n_i^-$ negative mesons of energy $\epsilon_i$; (ii) $x$, $y$, $z$, and $\rho$
values which define the position and spin state of the particle. Because
of $\tau$ any of these axes must be taken twice, the value $\tau = 0$ being
assigned to one of them and $\tau = 1$ to the other. If the vector $X$,
which represents the state, has the direction of an axis $\tau = 0$, the
particle is a neutron; in the other case it is a proton. This means that,
for the interaction, $H'$ must contain a matrix which mediates
between the axes $\tau = 0$ and $\tau = 1$. This matrix is associated with the
factor $g$ by setting


$g = g\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$, $g^* = g^*\begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}$   (358)


wherein $g$ and $g^*$ on the right-hand sides now denote ordinary numbers.
The matrices operate on $\tau$ only, and it is to be understood that the
first row is for $\tau = 0$ and the second row for $\tau = 1$. The matrix $g$
has only one element, $g_{01}$, which differs from zero and permits a transition
neutron $\to$ proton, whereas all the other transitions, $P \to P$,
$P \to N$, $N \to N$, are blocked by $g$. In a corresponding way $g^*$ permits
the transition $P \to N$, all others being blocked. The expression
(357) for $H'$ can now be given the following interpretation: $H'_{n_i^+,\,n_i^+ + 1}$,
which is associated with the emission of a positive meson, contains $g^*$;
this means that the emission is possible only when there is a simultaneous
transmutation of a proton into a neutron. On the other
hand, $H'_{n_i^-,\,n_i^- + 1}$ corresponds to the emission of a negative meson and
contains $g$, thereby requiring the transmutation of a neutron into a
proton.
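
The selection rule carried by $g$ and $g^*$ can be made concrete with two-row matrices. This is a sketch only: it assumes that the first index labels the initial and the second the final value of $\tau$, that $g$ has the single non-zero element $g_{01}$ as stated in the text, and it uses a placeholder coupling value; the names are hypothetical.

```python
STATES = {0: "neutron", 1: "proton"}   # tau = 0 and tau = 1

G_VALUE = 1.0                          # placeholder for the number g

# g has only the element g_01; g* is its adjoint.
G = [[0, G_VALUE], [0, 0]]
G_STAR = [[0, 0], [G_VALUE, 0]]


def allowed_transitions(matrix):
    """List (initial, final) pairs whose matrix element is non-zero."""
    return [(STATES[i], STATES[f])
            for i in (0, 1) for f in (0, 1) if matrix[i][f] != 0]


def matmul2(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


via_g = allowed_transitions(G)            # neutron -> proton only
via_g_star = allowed_transitions(G_STAR)  # proton -> neutron only
twice_g = matmul2(G, G)                   # two successive g processes
```

Since `twice_g` vanishes, the same nucleon cannot undergo two successive transitions of the kind permitted by $g$ without an intermediate transition of the kind permitted by $g^*$.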

The matrices of (358) can be represented formally by using the Dirac 
matrices $\gamma_1$ and $\gamma_2$ as defined by (231), namely,


na (e a) e=Cy 5) (358 




It follows from this that 
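$\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} = \tfrac{1}{2}(\gamma_1 - i\gamma_2)$, $\begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} = \tfrac{1}{2}(\gamma_1 + i\gamma_2)$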


and thus $g$ and $g^*$ take the forms

$g\,\tfrac{1}{2}(\gamma_1 - i\gamma_2)$, $g^*\,\tfrac{1}{2}(\gamma_1 + i\gamma_2)$

When $\gamma_1$ and $\gamma_2$ are applied in this sense, they have nothing to do with
the spin of the nucleon but operate only on the charge coordinate $\tau$
which, because of its analogy to the spin, is often called “isotopic spin.”

According to (350), in the case of a pseudoscalar field the interaction 
is given by 


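$L' = -\kappa\,(g\,\tfrac{1}{6} w_{\alpha\beta\gamma}\psi_{\alpha\beta\gamma}^* + f\,\tfrac{1}{24} w_{\alpha\beta\gamma\delta}\chi_{\alpha\beta\gamma\delta}^*)$ + conjugate complex

If we denote the pseudovectors $w_{\alpha\beta\gamma}$ and $\psi_{\alpha\beta\gamma}$ by $w_\alpha'$ and $\chi_\alpha'$ respectively
and the pseudoscalars $w_{\alpha\beta\gamma\delta}$ and $\chi_{\alpha\beta\gamma\delta}$ by $w_0'$ and $\psi'$, then

$L' = -\kappa\,(g w_\alpha'\chi_\alpha'^* + f w_0'\psi'^*)$ + conjugate complex

A comparison with (351) shows that all we must do is to substitute the
pseudoscalars $w_0'$ and $\psi'$ for $w_0$ and $\psi$, and the pseudovectors $w_\alpha'$ and
$\chi_\alpha' = (1/\kappa)(\partial\psi'/\partial x_\alpha)$ for the vectors $w_\alpha$ and $\chi_\alpha$. Then for $H'$ we obtain

$H' = \kappa\,(f w_0'\psi'^* + g w_i'\chi_i'^* - icg w_4'\pi')$ + conjugate complex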


wherein, according to (351’), 


Taking into account that 
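$w_0' = \Phi^*\beta\alpha_1\alpha_2\alpha_3\Phi$, $w_i' = \Phi^*\alpha_j\alpha_k\Phi$, $w_4' = -i\Phi^*\alpha_1\alpha_2\alpha_3\Phi$

and since $i\alpha_2\alpha_3 = \sigma_1$, etc., we obtain for $H'$

$H' = \kappa\int dv\,\Phi^*\left\{f\beta\alpha_1\alpha_2\alpha_3\psi'^* - g\left[c\alpha_1\alpha_2\alpha_3\pi' + \frac{i}{\kappa}(\sigma\,\mathrm{grad}\,\psi'^*)\right]\right\}\Phi$ + conjugate complex   (359)

The terms with the factor $\alpha_1\alpha_2\alpha_3$ may be set equal to zero in the case
of a particle at rest, reducing the above to

$H' = -i\int dv\,\Phi^*[g(\sigma\,\mathrm{grad}\,\psi'^*) + g^*(\sigma\,\mathrm{grad}\,\psi')]\Phi = -ig(\sigma\,\mathrm{grad}\,\psi'^*)_0 - ig^*(\sigma\,\mathrm{grad}\,\psi')_0$   (360)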




The index 0 signifies that the gradient is to be taken at the point 
occupied by the particle. 

On comparing (360) with (354), we see that the scalar and pseudo- 
scalar fields, which cannot be distinguished in vacuo, undergo quite 
different interactions with nucleons. Whereas H’ depends only on 
the density of the nuclear matter in the case of a scalar field, there is 


the added dependence on the spin $\sigma$ in the case of the pseudoscalar
field. This is due to the fact that, in order to arrive at an 
invariant Lagrangian, we have to combine the field with different 
terms. 

63. Vector and Pseudovector Fields. If only the part dealing 
with the particles is disregarded, the Lagrangian for a vector field is, 
according to (263) and (350), given by 


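$L = -\kappa\,(\psi_\alpha^*\psi_\alpha + \tfrac{1}{2}\chi_{\alpha\beta}^*\chi_{\alpha\beta}) - \kappa\,(g w_\alpha\psi_\alpha^* + \tfrac{1}{2} f w_{\alpha\beta}\chi_{\alpha\beta}^*)$ + conjugate complex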


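where $\chi_{\alpha\beta}$ is defined by $\frac{1}{\kappa}\left(\frac{\partial\psi_\beta}{\partial x_\alpha} - \frac{\partial\psi_\alpha}{\partial x_\beta}\right)$. From $L$ we get for $\pi_i$

$\pi_i = \frac{\partial L}{\partial(\partial\psi_i/\partial t)} = \frac{1}{ic\kappa}\frac{\partial L}{\partial\chi_{4i}} = -\frac{1}{ic}\,(\chi_{4i}^* + f^* w_{4i}^*)$
(361)
$\pi_i^* = -\frac{1}{ic}\,(\chi_{4i} + f w_{4i})$, $\pi_4 = \pi_4^* = 0$

Because of the relation

$\chi_{4i} = \frac{1}{ic\kappa}\frac{\partial\psi_i}{\partial t} - \frac{1}{\kappa}\frac{\partial\psi_4}{\partial x_i}$

we have

$\frac{\partial\psi_i}{\partial t} = ic\kappa\chi_{4i} + ic\,\frac{\partial\psi_4}{\partial x_i}$

so that the relation

$\bar{H} = \frac{\partial\psi_i}{\partial t}\pi_i + \frac{\partial\psi_i^*}{\partial t}\pi_i^* - L$

becomes

$\bar{H} = ic\kappa\,(\chi_{4i}\pi_i + \chi_{4i}^*\pi_i^*) + ic\left(\frac{\partial\psi_4}{\partial x_i}\pi_i + \frac{\partial\psi_4^*}{\partial x_i}\pi_i^*\right) - L$

In the expression $H = \int \bar{H}\,dv$ the terms with $(\partial\psi_4/\partial x_i)\pi_i$ and
$(\partial\psi_4^*/\partial x_i)\pi_i^*$ may be replaced by $-\psi_4(\partial\pi_i/\partial x_i)$ and $-\psi_4^*(\partial\pi_i^*/\partial x_i)$, a
change which occurs when we integrate by parts. We then obtain
for $\bar{H}$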
for H 


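$\bar{H} = ic\kappa\,(\chi_{4i}\pi_i + \chi_{4i}^*\pi_i^*) - ic\,(\psi_4\,\mathrm{div}\,\pi + \psi_4^*\,\mathrm{div}\,\pi^*)$
$\quad + \kappa\psi_i^*\psi_i + \kappa\psi_4^*\psi_4 + \frac{\kappa}{2}\chi_{ik}^*\chi_{ik} + \kappa\chi_{4i}^*\chi_{4i}$
$\quad + \kappa g w_i\psi_i^* + \kappa g w_4\psi_4^* + \frac{\kappa}{2} f w_{ik}\chi_{ik}^*$
$\quad + \kappa f w_{4i}\chi_{4i}^*$ + conjugate complex   (362)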


The summation over 7 and k is from 1 to 3 only. 
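
The integration by parts used above has an exact discrete counterpart, summation by parts, which can be checked on a periodic grid. This is a one-dimensional sketch: the sample functions stand in for $\psi_4$ and one component of $\pi$ and are arbitrary choices, and the periodic boundary replaces the vanishing of the fields at infinity.

```python
import math


def forward_diff(a):
    """Periodic forward difference a[i+1] - a[i]."""
    n = len(a)
    return [a[(i + 1) % n] - a[i] for i in range(n)]


def backward_diff(a):
    """Periodic backward difference a[i] - a[i-1]."""
    return [a[i] - a[i - 1] for i in range(len(a))]


N = 64
xs = [2 * math.pi * i / N for i in range(N)]
psi4 = [math.sin(x) + 0.3 * math.cos(2 * x) for x in xs]       # stand-in for psi_4
pi_field = [math.cos(x) - 0.5 * math.sin(3 * x) for x in xs]   # stand-in for pi

# Discrete analogue of  integral (d psi_4 / dx) pi dx = - integral psi_4 (d pi / dx) dx
lhs = sum(d * p for d, p in zip(forward_diff(psi4), pi_field))
rhs = -sum(a * d for a, d in zip(psi4, backward_diff(pi_field)))
```

The two sums agree up to rounding, because the boundary term cancels on a periodic grid just as it vanishes for fields that fall off at infinity.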

In (362), $\psi_4$ but not $\pi_4$ occurs. If the Hamiltonian contains a
coordinate $q_i$ the momentum $p_i$ of which is zero, we may, by way of
definition, substitute for $q_i$ an arbitrary function $f(q_k)$ of the other $q_k$
whereby $H$ is changed into an expression from which $q_i$ is eliminated.
All the $q_k$ coordinates are then given by the canonical equations as
functions of time, and by this the coordinate $q_i$ is also determined in its
dependence on time. In our case we have $\psi_4$ instead of $q_i$, and it can
be defined now as an extension of (258) by

$\psi_4 = -\frac{1}{\kappa}\frac{\partial\chi_{4i}}{\partial x_i} - g w_4$   (363)


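If we substitute this value of $\psi_4$ in the equation for $H$, the equation
becomes one containing only $\psi_1, \psi_2, \psi_3 = \psi$ together with the corresponding
momenta $\pi_1, \pi_2, \pi_3 = \pi$.

In order to arrive at the interaction, we must now select from (362)
those terms which contain the factor $g$ or $f$. At the same time we use
(363) instead of $\psi_4$ and put for $\chi_{4i}$ the quantity $-ic\pi_i^* - f w_{4i}$. We
then have

$\bar{H}' = -ic\kappa\,(f w_{4i}\pi_i + f^* w_{4i}^*\pi_i^*) + ic\,(g w_4\,\mathrm{div}\,\pi + g^* w_4^*\,\mathrm{div}\,\pi^*)$
$\quad + ic\,(g w_4\,\mathrm{div}\,\pi + g^* w_4^*\,\mathrm{div}\,\pi^*) + \kappa ic\,(f w_{4i}\pi_i + f^* w_{4i}^*\pi_i^*)$
$\quad + \kappa g w_i\psi_i^* - icg w_4\,\mathrm{div}\,\pi + \frac{\kappa}{2} f w_{ik}\chi_{ik}^* - \kappa icf w_{4i}\pi_i$
$\quad$ + conjugate complex

$\quad = icg w_4\,\mathrm{div}\,\pi + \kappa g w_i\psi_i^* + \frac{\kappa}{2} f w_{ik}\chi_{ik}^* - \kappa icf w_{4i}\pi_i$
$\quad$ + conjugate complex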


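and, since $H' = \int \bar{H}'\,dv$, and taking the expressions of (348) for
$w_i$ and $w_{i4}$, we get for $H'$

$H' = \int dv\,\Phi^*\left\{g\,[\kappa(\alpha\psi^*) - c\,\mathrm{div}\,\pi] + f\kappa\beta\left[-\frac{i}{\kappa}(\sigma\,\mathrm{curl}\,\psi^*) + c(\alpha\pi)\right]\right\}\Phi$ + conjugate complex

For a nucleon at rest the above equation reduces to

$H' = -\int dv\,\Phi^*[gc\,\mathrm{div}\,\pi + fi\beta(\sigma\,\mathrm{curl}\,\psi^*)]\Phi$ + conjugate complex

$\quad = -gc\,(\mathrm{div}\,\pi)_0 - fi\,(\sigma\,\mathrm{curl}\,\psi^*)_0$ + conjugate complex   (364)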


The first part determines the interaction due to longitudinal mesons,
and the second part is caused by transverse mesons. The index 0
again indicates that div $\pi$ and curl $\psi^*$ must be taken at the point
occupied by the nucleon. In the same manner as in the preceding
section, the probabilities of an emission or absorption of a meson can
be evaluated from $H'$. To do so we expand $\psi$ and $\pi$ into


=e 1 , =. i rs 
hi ih p3 que" ot * a ., pizeije 


a 


and transcribe $q_{ij}$ and $p_{ij}$ into matrices $Q_{ij}$ and $P_{ij}$ which, according to
(291), we define by 
k;? 
c (« + - is) 


k? 
k+ ral — $51) 


(Ai; — B,) 


(Ai; + Bi) 


The $A_{ij}$ and $B_{ij}$ terms are required in order to satisfy the relations of
(356) (a suitable coordinate system $K$ having been chosen). Then
the $A_{ij}$ and $B_{ij}$ pertain to emission and the $A_{ij}^\dagger$ and $B_{ij}^\dagger$ to absorption,
so that, for example, the element $H'_{n_{i1}^+,\,n_{i1}^+ + 1}$ which determines the
probability of the emission of a positive meson ($j = 1$) is, with the
help of (364), given by


4 
[x *| peste) 
H'ngtnatp = —g*c a var mit + 1 div (ene**) 9 
o(. + me) 
K 


Since, for a longitudinal meson, $e_{i1}$ has the direction of $k_i$, we have

$\mathrm{div}\,(e_{i1}e^{-ik_i r}) = -i\,(e_x k_{ix} + \cdots)\,e^{-ik_i r} = -ik_i e^{-ik_i r}$

and therefore


h'ce 1 
Aaya nt +L = ig* oe Se V Nit elec ke Fo (365) 


$\epsilon_i$ is the energy of the emitted meson, and $r_0$ denotes the radius vector
to the position of the particle. The element is associated with that
$g^*$ which permits the transition $P \to N$ only.
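
The separation of (364) into a longitudinal part (through the divergence) and a transverse part (through the curl) rests on how a plane wave with a fixed polarization vector behaves under these operations. The sketch below uses only the elementary identities $\mathrm{div}\,(e\,e^{isk\cdot r}) = is\,(k\cdot e)\,e^{isk\cdot r}$ and $\mathrm{curl}\,(e\,e^{isk\cdot r}) = is\,(k\times e)\,e^{isk\cdot r}$ with $s = \pm 1$; the sign `s`, the sample vectors, and the function names are assumptions made for illustration.

```python
def dot(a, b):
    """Scalar product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    """Vector product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def plane_wave_div_curl(k, e, s=-1):
    """Prefactors multiplying exp(i*s*k.r) in div and curl of e*exp(i*s*k.r)."""
    div_coeff = 1j * s * dot(k, e)
    curl_coeff = tuple(1j * s * c for c in cross(k, e))
    return div_coeff, curl_coeff


k = (0.0, 0.0, 2.0)                # wave vector along z, |k| = 2
e_longitudinal = (0.0, 0.0, 1.0)   # unit vector parallel to k
e_transverse = (1.0, 0.0, 0.0)     # unit vector perpendicular to k

div_l, curl_l = plane_wave_div_curl(k, e_longitudinal)
div_t, curl_t = plane_wave_div_curl(k, e_transverse)
```

A longitudinal wave has a non-vanishing divergence and no curl, a transverse wave the reverse, so each polarization contributes to only one of the two terms.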

The emission of a negative longitudinal meson is made possible by
the matrix $B_{i1}$, which furnishes the element


Vn 1 kje™*0 365’) 


FD na- nein 4+1 = —¥W 


ie A 


The absorption element can be evaluated by the rule $H'_{n_i+1,\,n_i} = H'^{*}_{n_i,\,n_i+1}$.

The emission of a transverse meson is due to curl $\psi$ or curl $\psi^*$. For
example, in the case of a positive meson, $H'_{n_{ij}^+,\,n_{ij}^+ + 1}$ is given by


r 
h c? = 
Haiti nitt = —f* an N eh V nit + 1 [6 curl (e,;e%) 6] 


For curl (e;;e"**) we find 


ni+1n = 


ikyr Be ae wai hei "iva ae 
curl, (ex;€ < ) = rs (ez€ i”) bisa’ aa (eye < ) = alk; x e;j)ze*** 
hence 


FL! nest nesta = if* ele myn" Vn ijt eat: (c: 7; * k,)e* i“ 


Similarly we obtain 


2 = Me 
He migs = fe L Vng $1 @ ey =kde* (866) 
2° /.; 




And for neutral mesons we find 


h'xe 1 as, r 
Fanos = §LEL ae ni +1 1 kye™ ® 


2 =" 
Fl nssemest = «tf Var V nig + 1 (0° e4; = kyo 
2 BV, 


(367) 


From the formulas for a vector field we can easily derive those for a 
pseudovector field. According to (350), the interaction terms for the 
two fields are given by 

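$-\kappa\,(g_2 w_\alpha\psi_\alpha^* + \tfrac{1}{2} f_2 w_{\alpha\beta}\chi_{\alpha\beta}^*)$ + conjugate complex
$-\kappa\,(\tfrac{1}{6} f_3 w_{\alpha\beta\gamma}\chi_{\alpha\beta\gamma}^* + \tfrac{1}{2} g_3 w_{\alpha\beta}\psi_{\alpha\beta}^*)$ + conjugate complex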
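According to Section 48, the $\chi_{\alpha\beta\gamma}$ form a pseudovector having the
components

$\chi_1' = \chi_{234}$, $\chi_2' = \chi_{314}$, $\chi_3' = \chi_{124}$, $\chi_4' = -\chi_{123}$

Therefore we have

$\tfrac{1}{6} w_{\alpha\beta\gamma}\chi_{\alpha\beta\gamma}^* = w_{234}\chi_1'^* + w_{314}\chi_2'^* + w_{124}\chi_3'^* - w_{123}\chi_4'^*$

That is, the $\psi_\alpha$ are to be replaced by the $\chi_\alpha'$ and the $w_\alpha$ by $w_{ik4}$ and
$-w_{123}$. Furthermore we saw in Section 50 that the tensor components
$\chi_{\alpha\beta}$ of the vector field, in the case of the pseudovector field, are to be
replaced not by $\psi_{\alpha\beta}$, but by the six-vector $\psi_{\alpha\beta}'$, which is dual to $\psi_{\alpha\beta}$ in
the sense that $\psi_{12}' = \psi_{43}$, and so on. Thus $\tfrac{1}{2} w_{\alpha\beta}\chi_{\alpha\beta}^*$ is to be replaced
by

$\tfrac{1}{2} w_{\alpha\beta}\psi_{\alpha\beta}'^* = w_{23}\psi_{41}^* + w_{31}\psi_{42}^* + w_{12}\psi_{43}^* + w_{41}\psi_{23}^* + w_{42}\psi_{31}^* + w_{43}\psi_{12}^*$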


With this new Lagrangian the calculation is to be carried out in the
same way as for a vector field. Now, instead of $\pi_i$, we have
$\pi_i' = \partial L/\partial(\partial\chi_i'/\partial t)$, which is found to be $-(1/ic)\,\psi_{4i}'^{*}$. For the Hamiltonian
of the interaction we obtain

$H' = \int dv\,\Phi^*\{f\,[-\kappa i(\sigma\chi'^*) + c\alpha_1\alpha_2\alpha_3\,\mathrm{div}\,\pi'] + g\beta\,[i(\alpha\,\mathrm{curl}\,\chi'^*) - \kappa c(\sigma\pi')]\}\Phi$ + conjugate complex   (368)

For a particle at rest the expression reduces to

$H' = -fi\kappa\,(\sigma\chi'^*)_0 - g\kappa c\,(\sigma\pi')_0$ + conjugate complex   (368′)




64. The Potential of the Nuclear Forces. With the help of the 
theory developed in the two preceding sections it should be possible to 
solve certain problems with which modern physics is particularly 
concerned. One of these problems is the question of the forces with 
which the protons and neutrons of an atomic nucleus act on one 
another. The idea suggests itself of explaining these forces by assum- 
ing an exchange of mesons between the nuclear particles, an idea 
which, as we shall see, provides a potential that is in conformity with 
at least the most characteristic properties of nuclear forces, for exam- 
ple, in accounting for their extraordinarily small range. However, 
with decreasing separation of the nuclear particles, the potential
becomes infinite as $1/r^3$, and such a strong divergence would exclude
the existence of a lowest stationary state. Therefore we are compelled
to accept the theoretical potential $V(r)$ only for the region outside
a certain “cut-off” radius $r_0$, whereas for $r < r_0$ it is assumed
that $V(r)$ is either zero or $V(r_0)$. It is clear that this procedure of
cutting off the potential at $r = r_0$, which is rather arbitrary and, in
addition, not relativistically invariant, excludes the validity of the 
theory for just that region within which the effectiveness of the 
nuclear forces is primarily limited. In addition to this fundamental 
difficulty, other problems arise. Are the forces mediated by charged 
or uncharged mesons, or are both kinds in action? There is experi- 
mental evidence (Tuve, Heydenburg, and Hafstad) that between 
two protons there is exactly the same attraction as between a proton 
and a neutron. This fact would suggest the assumption that the 
forces are due exclusively to neutral mesons (Bethe’s neutral theory). 
On the other hand, however, the radioactive decay can be understood 
only on the assumption that nuclear particles also emit charged mesons, 
and for this reason Kemmer proposed his “symmetric theory” in
which both charged and uncharged mesons are assumed to be effective.
This theory, however, runs into the difficulty that the quadrupole
moment of the deuteron appears with the wrong sign. Again there
is the question as to which of the four possible meson fields should be
applied to the theory. It has become impossible to explain all experimental
facts on the basis of one kind of meson only, and therefore
Møller and Rosenfeld suggested as a solution a resort to a combination
of vector and pseudoscalar mesons. The cosmic radiation is
assumed to contain such a combination also, a radiation which, 
because of the different lifetimes of the two kinds of mesons, in the 
lower layers of the atmosphere should consist of pseudoscalar mesons 
only. 
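
The cut-off procedure described above can be written down as a small function. This is a minimal sketch: the two inner options correspond to the alternatives named in the text ($V = 0$ or $V = V(r_0)$ for $r < r_0$), while `sample_potential` is a purely hypothetical short-range form with a $1/r^3$ singularity, chosen only to have something to cut off.

```python
import math


def cutoff_potential(v, r, r0, inner="zero"):
    """Return v(r) for r >= r0; inside r0 return 0 or the constant v(r0)."""
    if r <= 0:
        raise ValueError("r must be positive")
    if r >= r0:
        return v(r)
    if inner == "zero":
        return 0.0
    if inner == "flat":
        return v(r0)
    raise ValueError("inner must be 'zero' or 'flat'")


def sample_potential(r, strength=1.0, kappa=1.0):
    """Hypothetical attractive potential: exponential fall-off times 1/r^3."""
    return -strength * math.exp(-kappa * r) / r ** 3


outside = cutoff_potential(sample_potential, 2.0, 1.0)
inside_zero = cutoff_potential(sample_potential, 0.5, 1.0, inner="zero")
inside_flat = cutoff_potential(sample_potential, 0.5, 1.0, inner="flat")
```

Either choice removes the singularity at small $r$, at the price of the arbitrariness and loss of relativistic invariance noted in the text.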

Thus at the present stage the theory of nuclear forces still has a very
provisional character, the more so since the transmutation processes 
of the mesons seem to take place in such a way as to influence the 
mechanism. Nevertheless it may be of use to the reader to study at 
least the method of the theory, since he gets an insight into the diffi- 
culties that occur and at the same time becomes familiar with the 
concept of an “exchange force” which plays such an important 
part in modern physics. We may confine our investigation to the
forces brought about by vector mesons; however, we must treat the
cases of charged and uncharged mesons separately.

(a) The Forces Mediated by Neutral Mesons. As neutral mesons can 
be exchanged between like particles as well as between unlike ones, 
the forces caused by them are independent of the nature of the nuclear 
particles. In order to evaluate these forces we consider two particles 
which are in interaction with a neutral meson field. Without the 
interaction H’, the system, which consists of the particles and the 
field, would, in a stationary state, possess an energy $E_o$ composed of
the energies of the particles and the field. By the interaction, which
we consider a small perturbation, the energy is changed by correction
terms which, according to Section 35, in the first, second, and higher
orders are given by

$E' = H'_{oo}$, $E'' = \sum_n \frac{H'_{on}H'_{no}}{E_o - E_n}$, etc.   (369)


The index $o$ indicates the state of the system to which the correction
terms belong and may be supposed to correspond to a field containing
no mesons (all $n_{ij} = 0$), whereas the index $n$ denotes an intermediate
state into which a transition from the state $o$ is possible so that $H'_{on} \neq 0$.
Since the correction terms are due to H’, they give the energy called 
forth by the interaction, that is, both the energy with which the par- 
ticles act on one another by means of the field and the energy cor- 
responding to the forces exerted by the particles on themselves. 
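
The second-order term of (369) can be compared with an exact result in the simplest possible case. This is a toy sketch, not the meson problem itself: it assumes a system with one state $o$ and one intermediate state $n$, a perturbation with vanishing diagonal elements, and arbitrary numerical values.

```python
import math


def second_order_shift(couplings, energies, e_o):
    """E'' = sum_n H'_on H'_no / (E_o - E_n), with H'_no the conjugate of H'_on."""
    return sum((h * h.conjugate()).real / (e_o - e_n)
               for h, e_n in zip(couplings, energies))


def exact_lower_level(e_o, e_n, v):
    """Lower eigenvalue of the 2x2 matrix [[e_o, v], [conj(v), e_n]]."""
    mean = 0.5 * (e_o + e_n)
    half_gap = 0.5 * (e_n - e_o)
    return mean - math.sqrt(half_gap ** 2 + abs(v) ** 2)


e_o, e_n = 0.0, 1.0     # unperturbed energies (arbitrary units)
v = 0.01 + 0j           # off-diagonal element H'_on

shift_perturbative = second_order_shift([v], [e_n], e_o)
shift_exact = exact_lower_level(e_o, e_n, v) - e_o
```

Because the diagonal elements of the perturbation vanish, the first-order term is zero, and the second-order term is negative for the lowest state, since $E_o - E_n < 0$.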


According to (364), $H'$ is a linear function of $\psi$ and $\pi$ and is, therefore,
also a linear function of the matrices $Q_{ij}$ and $P_{ij}$. As all the
diagonal elements of these matrices are zero, the correction term
$H'_{oo} = E'$ of the first order vanishes. The term $E''$, however, differs
from zero. In the expression for $E''$, $H'_{on}$ corresponds to the transition
of the system to a possible intermediate state. According to (356)
the system is capable only of transitions from the state $o$ (all $n_{ij} = 0$)
into those states in which the field contains only one meson of an
arbitrary kind $ij$, whereas the nuclear particles remain unchanged
(actually, they undergo a reaction which, however, may be disre- 
garded). Let us first suppose that the created meson is emitted by 
particle 1. The system then returns to the initial state (the factor
$H'_{no}$ in $E''$), whereby the possibilities arise that the meson is absorbed
by either 1 or 2. The $E''$ that comes about in the first way occasions
a self-energy of particle 1, whereas the interaction energy is effected
by the second process. The latter energy, which will depend on the
positions $r_1$ and $r_2$ of the two particles, may be denoted by $V$. $V$ is
composed of two parts, $V_l$ and $V_t$, which are due to longitudinal and
transverse mesons respectively. To evaluate $V_l$ and $V_t$ we have to
make use of the expressions (367). The denominator in (369) is
the difference between the energies of the system in the states $o$ and $n$
and thus equal to the negative energy of the exchanged meson. If
we set $g$ equal to $g^*$ and $f$ equal to $f^*$ in (369), we obtain


242 2_ik,-(r,—r.) 
g°h*ck Byer irs 
Wee ~3 3 : i ae (370) 
2 me . * * M4 . os x . 
Va = aoe » (or eg SNCs 02 BO scion az 
FF 


The factor 2 is necessary since for any exchange there is an inverse 
one that furnishes the same contribution. 

We shall first use the expressions (370) and (371) to make clear the 
nature of an exchange force. A force of this kind must always occur 
if two particles have the possibility of exchanging a particle, regardless 
of the kind, and in this way transfer momentum to each other. How- 
ever, it must not be understood that the force comes about by a real 
exchange in a kind of ball playing; for it is only the potentiality of an 
exchange that matters, just as the potential energy of a lifted stone is 
due to the potentiality of its falling down. In the case of the stone, 
the energy is given by the product of the weight and altitude. In our 
case, in place of this product the perturbation theory provides the 
expression 

$$\frac{H_{on}' H_{no}'}{E_o - E_n}$$


where n denotes an intermediate state created by the emission of a 
meson. In this expression, however, only one potentiality of an 
exchange is taken into account since it refers to a meson of a given 


momentum. To obtain the total energy we must consider all possible
momenta $\hbar\mathbf{k}_i$, in this way arriving at the sums in (370) and (371).

This corpuscular interpretation, however, is not adaptable to a
certain circumstance which is of essential importance for an exchange
force and becomes conspicuous only in the wave picture. In this
picture a wave $e^{i\mathbf{k}_i\cdot\mathbf{r}}$ of the wavelength $\lambda_i = 2\pi/k_i$ corresponds to any
exchange, and the potential energy is the result of the interferences of
all these possible waves. It is because of this interference that the
energy V depends on the distance of the two particles, an inconceivable
fact in the corpuscle picture.


To evaluate the sums in (370) and (371), in which $\boldsymbol{\sigma}_1$ and $\boldsymbol{\sigma}_2$ denote
the spins of the two nucleons, we replace the sums by integrals by
using (343) for the number of the propagation vectors $\mathbf{k}_i$ with a
direction within the solid angle dΩ and a magnitude $k_i$ between k and
k + dk. According to (343), this number is given by


$$\frac{l^3 k^2\,dk\,d\Omega}{(2\pi)^3}$$


(without the factor 2, which had to be used in Section 59, for there were
two directions of polarization for any $\mathbf{k}_i$). Because of the relation
$\epsilon_i = \hbar c\sqrt{\kappa^2 + k_i^2}$, (370) and (371), with $\mathbf{r}_1 - \mathbf{r}_2 = \mathbf{r}$, become


k? dk 


dail “2a, [aa f a s* zee (372) 
Vu = -2 fe yf af ass e+e Fou e;=k)(a9: e;« k)e™** 
(373) 


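The integral in (372) can be transformed if $k^4$ is replaced by
$k^2(k^2 + \kappa^2 - \kappa^2)$, thus becoming

$$\int d\Omega \int dk\, k^2 e^{i\mathbf{k}\cdot\mathbf{r}} - \kappa^2 \int d\Omega \int dk\, \frac{k^2}{\kappa^2 + k^2}\, e^{i\mathbf{k}\cdot\mathbf{r}}$$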


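The first integral represents the sum of all $e^{i\mathbf{k}_i\cdot\mathbf{r}}$, which, by a
well-known theorem that is valid for orthogonal functions, is given by

$$\frac{1}{l^3}\sum_i e^{i\mathbf{k}_i\cdot(\mathbf{r}_1 - \mathbf{r}_2)} = \delta(\mathbf{r}_1 - \mathbf{r}_2) = \delta(\mathbf{r})$$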


Thus the sum vanishes for any distance $r \neq 0$ and consequently may
be omitted. We then have for $V_I$


V1 Sharp me | a af ase ar ae 


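The integral

$$f(r) = \int d\Omega \int dk\, \frac{k^2}{\kappa^2 + k^2}\, e^{i\mathbf{k}\cdot\mathbf{r}}$$

is a function of $r = |\mathbf{r}|$. If we assume that particle 2 is at
the origin of the coordinate system ($\mathbf{r}_2 = 0$), by differentiating f(r)
relative to the coordinates x, y, z of particle 1, we get

$$\Delta f - \kappa^2 f = -\int d\Omega \int dk\, k^2 e^{i\mathbf{k}\cdot\mathbf{r}} = 0 \quad (r \neq 0) \tag{374}$$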
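The solution of this differential equation is $f(r) = Ce^{-\kappa r}/r$, where C is a
constant. To determine this constant we evaluate f(r) for κ = 0,
and we find that

$$f_0(r) = \int d\Omega \int_0^\infty dk\, e^{i\mathbf{k}\cdot\mathbf{r}} = 2\pi \int_0^\pi \sin\theta\, d\theta \int_0^\infty dk\, e^{ikr\cos\theta} = \frac{4\pi}{r} \int_0^\infty dk\, \frac{\sin kr}{k} = \frac{2\pi^2}{r}$$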


Thus the constant is $2\pi^2$, and we have


2,3 
ie e < 


Bide ’ tess (375) 
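
The two steps used above, namely that $Ce^{-\kappa r}/r$ solves the differential equation (374) away from the origin and that the constant follows from the integral of sin(kr)/k, can be checked numerically. The following is only a sketch under stated assumptions: the value of `KAPPA`, the finite-difference step, and the integration cutoff are arbitrary illustrative choices, and all function names are hypothetical.

```python
import math

KAPPA = 1.3  # hypothetical inverse range, arbitrary units

def f(r, C=2 * math.pi ** 2, kappa=KAPPA):
    """Candidate solution C exp(-kappa r) / r."""
    return C * math.exp(-kappa * r) / r

def radial_laplacian(func, r, h=1e-4):
    """Laplacian of a spherically symmetric function: f'' + (2/r) f'."""
    d1 = (func(r + h) - func(r - h)) / (2 * h)
    d2 = (func(r + h) - 2 * func(r) + func(r - h)) / h ** 2
    return d2 + 2 * d1 / r

def residual(r):
    """(Laplacian - kappa^2) f divided by kappa^2 f; close to 0 for r != 0."""
    return (radial_laplacian(f, r) - KAPPA ** 2 * f(r)) / (KAPPA ** 2 * f(r))

def dirichlet_integral(upper=2000.0, n=400000):
    """Midpoint-rule estimate of the integral of sin(x)/x from 0 to upper."""
    h = upper / n
    return sum(math.sin((i + 0.5) * h) / ((i + 0.5) * h) for i in range(n)) * h

residuals = [residual(r) for r in (0.5, 1.0, 2.0)]
si = dirichlet_integral()

print(residuals, si, math.pi / 2)
```

Since the integral of sin(kr)/k over k from 0 to infinity equals π/2 for every r > 0, the factor 4π/r in $f_0(r)$ yields $2\pi^2/r$, in agreement with the constant found in the text.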


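To evaluate $V_{II}$ due to the transverse mesons (j = 2, 3), we carry out
the summation over j first. Since $\mathbf{e}_{i2} \times \mathbf{k}_i = \mp k_i \mathbf{e}_{i3}$ and
$\mathbf{e}_{i3} \times \mathbf{k}_i = \pm k_i \mathbf{e}_{i2}$, we find

$$\sum_j (\boldsymbol{\sigma}_1 \cdot \mathbf{e}_{ij} \times \mathbf{k}_i)(\boldsymbol{\sigma}_2 \cdot \mathbf{e}_{ij} \times \mathbf{k}_i) = k_i^2 |\boldsymbol{\sigma}_1||\boldsymbol{\sigma}_2| \{\cos(\boldsymbol{\sigma}_1 \mathbf{e}_{i2}) \cos(\boldsymbol{\sigma}_2 \mathbf{e}_{i2}) + \cos(\boldsymbol{\sigma}_1 \mathbf{e}_{i3}) \cos(\boldsymbol{\sigma}_2 \mathbf{e}_{i3})\}$$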
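If in the braces we add and subtract $\cos(\boldsymbol{\sigma}_1 \mathbf{e}_{i1}) \cos(\boldsymbol{\sigma}_2 \mathbf{e}_{i1})$, the first
three terms together with $k_i^2|\boldsymbol{\sigma}_1||\boldsymbol{\sigma}_2|$ give $k_i^2(\boldsymbol{\sigma}_1\boldsymbol{\sigma}_2)$, whereas the last
(subtracted) term is $(\boldsymbol{\sigma}_1\mathbf{k}_i)(\boldsymbol{\sigma}_2\mathbf{k}_i)$, since $\mathbf{e}_{i1}$ has the direction of $\mathbf{k}_i$.
Thus we obtain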


2f°K My ay 
- [asf a sale _ (HAY) 


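That part of the integral containing $(\boldsymbol{\sigma}_1\boldsymbol{\sigma}_2)$ can be transformed into

$$(\boldsymbol{\sigma}_1\boldsymbol{\sigma}_2) \int d\Omega \int dk\, \frac{k^4}{\kappa^2 + k^2}\, e^{i\mathbf{k}\cdot\mathbf{r}} = -(\boldsymbol{\sigma}_1\boldsymbol{\sigma}_2)\, \kappa^2 \int d\Omega \int dk\, \frac{k^2}{\kappa^2 + k^2}\, e^{i\mathbf{k}\cdot\mathbf{r}} = -(\boldsymbol{\sigma}_1\boldsymbol{\sigma}_2)\, \kappa^2\, \frac{2\pi^2 e^{-\kappa r}}{r}$$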


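If in addition we take the direction of $\mathbf{r} = \mathbf{r}_1 - \mathbf{r}_2$ as the z axis and
denote the angle (kz) by θ, we have

$$\int d\Omega \int dk\, \frac{k^2}{\kappa^2 + k^2}\, e^{i\mathbf{k}\cdot\mathbf{r}} (\boldsymbol{\sigma}_1\mathbf{k})(\boldsymbol{\sigma}_2\mathbf{k}) = \int d\Omega \int dk\, \frac{k^2}{\kappa^2 + k^2}\, e^{ikr\cos\theta} (\sigma_{1x}\sigma_{2x} k_x^2 + \cdots + \sigma_{1x}\sigma_{2y} k_x k_y + \cdots)$$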


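Since $k_x^2 = k^2 \sin^2\theta \cos^2\varphi$, the product $\sigma_{1x}\sigma_{2x}$ gives the contribution

$$\sigma_{1x}\sigma_{2x} \int dk\, \frac{k^4}{\kappa^2 + k^2} \int_0^\pi \sin^3\theta\, d\theta \int_0^{2\pi} d\varphi\, \cos^2\varphi\; e^{ikr\cos\theta}$$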


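The integration over $\cos^2\varphi$ gives π. By resolving $\sin^3\theta$ into
$\sin\theta(1 - \cos^2\theta)$ and making use of (374), we obtain

$$\sigma_{1x}\sigma_{2x}\, \pi \left[ -\frac{\pi\kappa^2 e^{-\kappa r}}{r} - \int dk\, \frac{k^4}{\kappa^2 + k^2} \int_0^\pi d\theta\, \sin\theta \cos^2\theta\; e^{ikr\cos\theta} \right]$$

To evaluate the last integral we differentiate the equation

$$\int dk\, \frac{k^2}{\kappa^2 + k^2} \int_0^\pi d\theta\, \sin\theta\; e^{ikr\cos\theta} = \frac{\pi e^{-\kappa r}}{r}$$

twice with respect to r, and we obtain

$$-\int dk\, \frac{k^4}{\kappa^2 + k^2} \int_0^\pi d\theta\, \sin\theta \cos^2\theta\; e^{ikr\cos\theta} = \pi\kappa^3 e^{-\kappa r} \left( \frac{1}{\kappa r} + \frac{2}{\kappa^2 r^2} + \frac{2}{\kappa^3 r^3} \right)$$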


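Therefore the contribution of $\sigma_{1x}\sigma_{2x}$ is

$$\sigma_{1x}\sigma_{2x}\, 2\pi^2\kappa^3 e^{-\kappa r} \left( \frac{1}{\kappa^2 r^2} + \frac{1}{\kappa^3 r^3} \right)$$

An exactly corresponding contribution is due to $\sigma_{1y}\sigma_{2y}$. On the other
hand, from $\sigma_{1z}\sigma_{2z}$ we obtain

$$\sigma_{1z}\sigma_{2z}\, 2\pi \int dk\, \frac{k^4}{\kappa^2 + k^2} \int_0^\pi d\theta\, \sin\theta \cos^2\theta\; e^{ikr\cos\theta} = -\sigma_{1z}\sigma_{2z}\, 2\pi^2\kappa^3 e^{-\kappa r} \left( \frac{1}{\kappa r} + \frac{2}{\kappa^2 r^2} + \frac{2}{\kappa^3 r^3} \right)$$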


The contributions of $\sigma_{1x}\sigma_{2y}$ and so on vanish. So, since $\sigma_z = (\boldsymbol{\sigma}\mathbf{r})/r$,
we obtain


Vu | Ge) (44 x ieee a) 


~ erlen (4 2 + 5) (376) 


The potentials $V_I$ and $V_{II}$ are quite different. $V_I$, which arises from the
longitudinal mesons, is independent of the spins, whereas $V_{II}$ depends
on the spins $\boldsymbol{\sigma}_1$ and $\boldsymbol{\sigma}_2$ of the two particles. $V_{II}$ is composed of two
parts. The first part is spherically symmetric, that is, it corresponds
to a central force, and is proportional to $(\boldsymbol{\sigma}_1\boldsymbol{\sigma}_2)$; thus it is positive for
parallel spins and negative for antiparallel spins. The second part is
proportional to $(\boldsymbol{\sigma}_1\mathbf{r})(\boldsymbol{\sigma}_2\mathbf{r})$, that is, it depends not only on the magnitude
but also on the direction of the distance $\mathbf{r}$, so that the eigenfunction
of the system consisting of the two particles loses its spatial symmetry,
and the same holds for the charge distribution. Indeed, in the case
of a deuteron, the existence of such an asymmetry was shown by
Kellogg, Rabi, Ramsey, and Zacharias.
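
The radial factors of the central and the direction-dependent parts arise from differentiating $e^{-\kappa r}/r$ twice. The sketch below checks this spatial structure by finite differences. It rests on assumptions that should be stated plainly: the spin operators are replaced by ordinary classical vectors `a` and `b` (so only the differentiation is tested, not the operator algebra), the overall coupling constant is left out, and `KAPPA`, the test point, and the function names are hypothetical.

```python
import math

KAPPA = 0.7  # hypothetical inverse range, arbitrary units

def yukawa(p):
    """exp(-kappa r)/r at the Cartesian point p = (x, y, z)."""
    r = math.sqrt(sum(c * c for c in p))
    return math.exp(-KAPPA * r) / r

def second_partial(func, p, i, j, h=1e-3):
    """Central-difference estimate of d^2 func / dx_i dx_j at p."""
    def shifted(di, dj):
        q = list(p)
        q[i] += di * h
        q[j] += dj * h
        return func(q)
    return (shifted(1, 1) - shifted(1, -1)
            - shifted(-1, 1) + shifted(-1, -1)) / (4 * h * h)

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def numeric_bracket(a, b, p):
    """(a.b) Laplacian(Y) - (a.grad)(b.grad) Y by finite differences."""
    hess = [[second_partial(yukawa, p, i, j) for j in range(3)]
            for i in range(3)]
    laplacian = sum(hess[i][i] for i in range(3))
    mixed = sum(a[i] * b[j] * hess[i][j] for i in range(3) for j in range(3))
    return dot(a, b) * laplacian - mixed

def closed_bracket(a, b, p):
    """kappa^2 Y [(a.b)(1 + 1/x + 1/x^2) - (a.r)(b.r)/r^2 (1 + 3/x + 3/x^2)]."""
    r = math.sqrt(dot(p, p))
    x = KAPPA * r
    central = dot(a, b) * (1 + 1 / x + 1 / x ** 2)
    tensor = dot(a, p) * dot(b, p) / r ** 2 * (1 + 3 / x + 3 / x ** 2)
    return KAPPA ** 2 * yukawa(p) * (central - tensor)

a = (1.0, 0.5, -0.3)
b = (0.2, -1.0, 0.4)
p = (0.8, -0.6, 1.1)
numeric = numeric_bracket(a, b, p)
closed = closed_bracket(a, b, p)

print(numeric, closed)
```

The term with 3/x² in the direction-dependent part is the one that behaves like 1/r³ at small distances.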

(b) The Forces Mediated by Charged Mesons. In the case of the
second approximation as considered here, charged mesons can act
only between unlike particles because an exchange of charged mesons
between like particles is impossible. The exchange is necessarily
connected with a change of the charges of the particles, the proton
being bound to transmute itself into a neutron, and vice versa.
Therefore we cannot calculate the interaction energy of the second order
$E^2$ in the same way as for neutral mesons. For them we imagined
that the system is brought from its initial state A into an intermediate
state Z by the emission of a meson by particle 1, whereupon the meson
is absorbed by the other particle 2. In this way the initial state is
restored. But if we apply the same procedure to a charged meson,
we get a final state E which differs from A since the two particles have
interchanged the charge coordinates τ. Therefore the quantity

$$\epsilon_{AE} = \frac{H_{AZ}' H_{ZE}'}{E_A - E_Z}$$


has not the meaning of an energy correction but rather determines the 
probability of a transition of the system from the state A into the 
state E. As we shall see, we can, however, form a matrix from the 


$\epsilon_{AE}$ and determine the energy corrections by bringing this matrix into
diagonal form. For this purpose, proton and neutron are considered
two modifications of the same particle, the ability to appear as a
proton or as a neutron being described by a "charge coordinate"
τ. The proton state may be denoted by τ = 1, and that of the neutron
by τ = 0. We must then coordinate in the Hilbert space two axes,
τ = 0 and τ = 1, to a nucleon. If the vector representing the state
lies in the direction of one of these axes, the particle is with certainty
a neutron or a proton, but otherwise there is only a certain probability
of finding a neutron or proton by a measurement of the charge. Since
we are now dealing with a system consisting of two nucleons, we need
four axes, each of which belongs to a value of $\tau_1$ and $\tau_2$ so that the axes
are to be marked by two indices. If we simply write a, b, c, d for 00,
01, 10, 11, a vector in the direction a means that both the first and
second particles are neutrons; b indicates the state wherein 1 is a
neutron, 2 a proton, and so on.

We now consider the transitions between the states a, b, c, d and
correlate an $\epsilon_{AE}$ to each of them. A transition between two of the
states is only possible if either of the particles 1 or 2 emits a meson
which thereupon is absorbed by 2 or 1. The meson may be positive
or negative, longitudinal or transverse, so that, according to these
possibilities,

$$\epsilon_{AE} = \sum_Z \frac{H_{AZ}' H_{ZE}'}{E_A - E_Z}$$

can be resolved into four partial sums. For the $H_{AZ}'$ and $H_{ZE}'$ quantities,
the expressions of (366) must be substituted in which a factor g
or f occurs. According to (358) these factors are matrices by which
only certain transitions are permitted. In what follows we shall
denote the factors belonging to particles 1 and 2 by $g_1, f_1$ and $g_2, f_2$
respectively. If A → Z means that a positive longitudinal meson is
emitted by 1, and Z → E that it is absorbed by 2, the factor of the
product $H_{AZ}'H_{ZE}'$ is $g_1^* g_2$. For a negative meson the factor is
$g_1 g_2^*$. The contribution of a longitudinal meson to $\epsilon_{AE}$ therefore
carries the factor $g_1^* g_2 + g_1 g_2^*$, which must be taken twice, since the
meson may as well be emitted by 2 and absorbed by 1. We shall now
denote the "isotopic spin matrices" $\sigma^{(1)}$ and $\sigma^{(2)}$, defined by (358'), by
$\tau^{(1)}$ and $\tau^{(2)}$ in order to indicate that they are now used with reference
to the charge coordinate and not to the spin. Then, denoting by g'
an ordinary number, we have

$$g = g'\,\tfrac{1}{2}(\tau^{(1)} + i\tau^{(2)}), \qquad g^* = g'^*\,\tfrac{1}{2}(\tau^{(1)} - i\tau^{(2)})$$

$$g_1^* g_2 + g_1 g_2^* = |g'|^2\,\tfrac{1}{2}(\tau_1^{(1)}\tau_2^{(1)} + \tau_1^{(2)}\tau_2^{(2)})$$


The factors $g_1^* g_2$ and $g_1 g_2^*$ are not to be read as products of matrices
in the sense of (120); they represent four-row matrices operating in
the space spanned by the axes a = 00, b = 01, c = 10, d = 11. $g_1$
operates on the first index, $g_2$ on the second, so that, for example,
$(g_1 g_2^*)_{ab}$ is to be understood as $(g_1)_{00}(g_2^*)_{01}$. Since only $g_{01}$ and
$g_{10}^*$ differ from zero, $g_1^* g_2 + g_1 g_2^*$ represents the matrix

$$g_1^* g_2 + g_1 g_2^* = |g'|^2\,\tfrac{1}{2}(\tau_1^{(1)}\tau_2^{(1)} + \tau_1^{(2)}\tau_2^{(2)}) = |g'|^2 \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \tag{377}$$


The meaning of this matrix factor is that charged mesons can cause
transitions only between the states b and c (for $\epsilon_{AE}$ determines the
probability of the transition A → E), that is, one particle must be a
proton, the other a neutron. Aside from this "exchange factor,"
the evaluation of $\epsilon_{AE}$ is the same as that of the expression (369) for $E^2$
with which $\epsilon_{AE}$ conforms if we disregard the charge. Thus we obtain
for $\epsilon_{aa}$, $\epsilon_{ab}$ and so on the matrix

$$\epsilon = \frac{1}{2} \left( \frac{|g'|^2}{g^2} V_I + \frac{|f'|^2}{f^2} V_{II} \right) \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \tag{378}$$


$V_I$ and $V_{II}$ being defined by (375) and (376). The factor ½ is necessary
because, according to (366), the elements of H' for charged mesons
are smaller by a factor $1/\sqrt{2}$ than those for neutral ones. The distinction
of g', f', and g, f means that these factors may be different for charged
and neutral mesons.

The $\epsilon_{AE}$ defined by (378) cannot as yet be interpreted as energy
corrections, since the initial and final states are not identical. But
by a rotation of the coordinate system, that is, a passage to other axes
a', b', c', d', we can make the matrix in (378) diagonal. Then we
obtain four quantities $\epsilon_{a'a'}$, $\epsilon_{b'b'}$, $\epsilon_{c'c'}$, $\epsilon_{d'd'}$ which may be considered
energy corrections of the states a', b', c', and d'.

To carry out the plan we determine first the eigenvalues of the
matrix in (378) by solving the equation

$$\begin{vmatrix} -a & 0 & 0 & 0 \\ 0 & -a & 1 & 0 \\ 0 & 1 & -a & 0 \\ 0 & 0 & 0 & -a \end{vmatrix} = 0$$


The solutions are a = 0, which occurs twice, and a = ±1. These
eigenvalues belong to four orthogonal directions a', b', c', d'. If


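$\mathbf{e}_i$ (i = 1, 2, 3, 4) denotes four unit vectors in these directions, and
$x_k^{(i)}$ (k = a, b, c, d) their components relative to the axes a, b, c, d, the
$\mathbf{e}_i$, as eigenvectors of the matrix

$$M = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}$$

must satisfy the equations $M\mathbf{e}_i = a_i\mathbf{e}_i$. For a = 0 it follows from
these equations (the two corresponding eigenvectors $\mathbf{e}_1$ and $\mathbf{e}_4$ may
belong to the directions a' and d')

$$x_b^{(1)} = 0, \quad x_c^{(1)} = 0, \quad x_b^{(4)} = 0, \quad x_c^{(4)} = 0$$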


That is, the directions a' and d' lie in the a-d plane and may be chosen
so as to coincide with a and d. Thus if a vector has one of these two
directions, both particles are either protons or neutrons. The
corresponding $\epsilon_{a'a'}$ and $\epsilon_{d'd'}$ are zero, in accordance with the fact that there
is no interaction between two like particles. For a = 1, belonging
to $\mathbf{e}_2 = b'$, however, we obtain from $M\mathbf{e}_2 = \mathbf{e}_2$

$$x_b^{(2)} = x_c^{(2)}, \qquad x_a^{(2)} = x_d^{(2)} = 0$$

Thus b' lies in the b-c plane and has equal components relative to the
b and c axes. Therefore a vector with the direction of b' represents a
state which is symmetric in the charge coordinates $\tau_1$ and $\tau_2$. The
probability amplitudes for the statements 1 = proton, 2 = neutron,
and vice versa, are equal.

Finally for the eigenvalue a = −1 ($\mathbf{e}_3 = c'$) we obtain, from
$M\mathbf{e}_3 = -\mathbf{e}_3$,

$$x_b^{(3)} = -x_c^{(3)}, \qquad x_a^{(3)} = x_d^{(3)} = 0$$

corresponding to a state which is antisymmetric in $\tau_1$ and $\tau_2$.

Thus the interaction between two nucleons, brought about by
charged mesons, can be described by the eigenvalues of the matrix

$$\epsilon = \frac{1}{2} \left( \frac{|g'|^2}{g^2} V_I + \frac{|f'|^2}{f^2} V_{II} \right) \frac{1}{2} (\tau_1^{(1)}\tau_2^{(1)} + \tau_1^{(2)}\tau_2^{(2)}) \tag{379}$$

Two of these eigenvalues are zero, and the other two are given by

$$\pm\frac{1}{2} \left( \frac{|g'|^2}{g^2} V_I + \frac{|f'|^2}{f^2} V_{II} \right)$$

in which the positive sign is to be taken for a state which is symmetric
as to charge, and the negative one for the antisymmetric state.
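
The diagonalization just described can be verified directly. The sketch below is an illustration only (the helper names are hypothetical): it takes the exchange matrix M in the basis a = 00, b = 01, c = 10, d = 11 and checks that the directions a', b', c', d' found above are eigenvectors with the eigenvalues 0, +1, −1, 0.

```python
import math

# Exchange matrix M in the basis a = 00, b = 01, c = 10, d = 11.
M = [[0, 0, 0, 0],
     [0, 0, 1, 0],
     [0, 1, 0, 0],
     [0, 0, 0, 0]]

def apply(matrix, v):
    """Matrix-vector product for plain nested lists."""
    return [sum(row[k] * v[k] for k in range(len(v))) for row in matrix]

def is_eigenvector(matrix, value, v, tol=1e-12):
    """True if matrix v = value v within the tolerance."""
    return all(abs(x - value * y) < tol for x, y in zip(apply(matrix, v), v))

s = 1 / math.sqrt(2)
eigenpairs = {
    "a'": (0, [1, 0, 0, 0]),     # both neutrons
    "b'": (1, [0, s, s, 0]),     # symmetric in the charge coordinates
    "c'": (-1, [0, s, -s, 0]),   # antisymmetric in the charge coordinates
    "d'": (0, [0, 0, 0, 1]),     # both protons
}

checks = {name: is_eigenvector(M, value, vec)
          for name, (value, vec) in eigenpairs.items()}
print(checks)
```

A state along b or c alone is not an eigenvector, which is another way of saying that a definite assignment "1 = neutron, 2 = proton" is not stationary under charged-meson exchange.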


(c) Comparison with Experience. We wish here to draw a brief
comparison between the results at which we have arrived and the
formulations reached when experimental facts are included in a purely
phenomenological theory. From the facts we learn, at least to a
certain approximation, that between two nucleons a central force of
very small range must be acting, the potential of which can be
represented by $e^{-\kappa r}/r$. Furthermore it seems probable that the interaction
is due to an exchange force, that is, it comes about by a mechanism in
which the charges or spins of the two particles are interchanged.
Formally, this is to be expressed in a way that the potential, when it
acts as an operator on the state, interchanges the values of one or both
of the observables mentioned above. An example of an operator of
this kind is the matrix $\tfrac{1}{2}(\tau_1^{(1)}\tau_2^{(1)} + \tau_1^{(2)}\tau_2^{(2)})$, which has the property
of multiplying a symmetric or antisymmetric function describing
a state by 1 or −1 respectively; this means that the effect of the
operator must consist of an interchange of the charges. The effect of
$\tfrac{1}{2}(\tau_1^{(1)}\tau_2^{(1)} + \tau_1^{(2)}\tau_2^{(2)})$ on the symmetric states a = 00 and d = 11
is zero, from which it must be inferred that a potential of the sort (379)
holds only for unlike particles. If the potential is required to include
the forces between like particles as well, the exchange factor must be
changed to

$$\tfrac{1}{2}[1 + \tau_1^{(1)}\tau_2^{(1)} + \tau_1^{(2)}\tau_2^{(2)} + \tau_1^{(3)}\tau_2^{(3)}] = \tfrac{1}{2}[1 + (\boldsymbol{\tau}_1\boldsymbol{\tau}_2)]$$

if $\tau^{(3)}$ denotes the isotopic spin matrix

$$\tau^{(3)} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$$

$\tau_1^{(3)}\tau_2^{(3)}$ is a diagonal matrix with the elements 1, −1, −1, 1 which,
added to $\tau_1^{(1)}\tau_2^{(1)} + \tau_1^{(2)}\tau_2^{(2)}$, furnishes a matrix with the eigenvalues
1, 1, −3, 1. Therefore an exchange force with the factor $(1 + (\boldsymbol{\tau}_1\boldsymbol{\tau}_2))/2$
works also in the states a = 00 and d = 11 and is used in Kemmer's
symmetric theory.
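
The statements about the isotopic spin products can be checked with explicit matrices. The sketch below is a hedged illustration: it assumes the standard 2×2 Pauli form for $\tau^{(1)}, \tau^{(2)}, \tau^{(3)}$ and the basis order a = 00, b = 01, c = 10, d = 11, and builds the two-particle operators as Kronecker products. The helper functions are hypothetical and use only plain nested lists.

```python
def kron(A, B):
    """Kronecker product; the first factor acts on particle 1."""
    m = len(B)
    size = len(A) * m
    return [[A[i // m][j // m] * B[i % m][j % m] for j in range(size)]
            for i in range(size)]

def add(*mats):
    n = len(mats[0])
    return [[sum(M[i][j] for M in mats) for j in range(n)] for i in range(n)]

def scale(c, A):
    return [[c * x for x in row] for row in A]

def apply(A, v):
    return [sum(row[k] * v[k] for k in range(len(v))) for row in A]

I2 = [[1, 0], [0, 1]]
# Assumed standard Pauli form of the isotopic spin matrices.
TAU = {1: [[0, 1], [1, 0]], 2: [[0, -1j], [1j, 0]], 3: [[1, 0], [0, -1]]}

pair = {i: kron(TAU[i], TAU[i]) for i in TAU}        # tau_1^(i) tau_2^(i)
charged_factor = scale(0.5, add(pair[1], pair[2]))   # factor in (377)
tau_dot = add(pair[1], pair[2], pair[3])             # (tau_1 tau_2)
exchange = scale(0.5, add(kron(I2, I2), tau_dot))    # (1 + (tau_1 tau_2))/2

print(charged_factor)
print(apply(tau_dot, [0, 1, -1, 0]))
```

Under these assumptions `exchange` comes out as the permutation matrix that swaps b and c and leaves a and d unchanged, which is the sense in which the factor $(1 + (\boldsymbol{\tau}_1\boldsymbol{\tau}_2))/2$ interchanges the charges.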

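In the same way as $(1 + (\boldsymbol{\tau}_1\boldsymbol{\tau}_2))/2$ acts on the charges, $(1 + (\boldsymbol{\sigma}_1\boldsymbol{\sigma}_2))/2$
operates on the spins of the two particles. This operator multiplies
the three states which are spin symmetric by 1, and the antisymmetric
state by −1. According to (241) the $\sigma^{(i)}$ are defined as four-row
matrices each of which is built up in two equal steps:

$$\sigma^{(1)} = \begin{pmatrix} \tau^{(1)} & 0 \\ 0 & \tau^{(1)} \end{pmatrix}, \qquad \sigma^{(2)} = \begin{pmatrix} \tau^{(2)} & 0 \\ 0 & \tau^{(2)} \end{pmatrix}, \qquad \sigma^{(3)} = \begin{pmatrix} \tau^{(3)} & 0 \\ 0 & \tau^{(3)} \end{pmatrix}$$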


The two steps correspond to the states of positive and negative energy. 


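As we are dealing only with the spin here, we may simply take τ for σ.
Thus there are the following three possibilities for an exchange force
with the potential $e^{-\kappa r}/r$:

(a) $\dfrac{e^{-\kappa r}}{r}\,[1 + (\boldsymbol{\sigma}_1\boldsymbol{\sigma}_2)]$ (Bartlett)

(b) $\dfrac{e^{-\kappa r}}{r}\,[1 + (\boldsymbol{\tau}_1\boldsymbol{\tau}_2)]$ (Heisenberg)

(c) $\dfrac{e^{-\kappa r}}{r}\,[1 + (\boldsymbol{\tau}_1\boldsymbol{\tau}_2)][1 + (\boldsymbol{\sigma}_1\boldsymbol{\sigma}_2)]$ (Majorana)

If we take as a fourth possibility a force without exchange, all four
types can be combined in the general formula

$$V = \frac{e^{-\kappa r}}{r}\,[a + b(\boldsymbol{\sigma}_1\boldsymbol{\sigma}_2) + c(\boldsymbol{\tau}_1\boldsymbol{\tau}_2) + d(\boldsymbol{\sigma}_1\boldsymbol{\sigma}_2)(\boldsymbol{\tau}_1\boldsymbol{\tau}_2)] \tag{380}$$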


wherein the constants a, b, c, d can be chosen in such a way as to arrive 
at the best possible agreement with experience. 

Now the theory of the meson does indeed furnish a potential the
range of which is limited to r ~ 1/κ by the factor $e^{-\kappa r}$. κ can be
calculated from the mass $m_0$ of the meson with which it is connected
by the relation $m_0 = \hbar\kappa/c$. For an $m_0$ of nearly 200 electron masses,
a value of $2 \times 10^{-13}$ cm is found for 1/κ which is in good agreement
with experiment. In addition it is seen that the potential (379),
when, according to Kemmer's symmetric theory, $(\boldsymbol{\tau}_1\boldsymbol{\tau}_2)$ is substituted
for $\tau_1^{(1)}\tau_2^{(1)} + \tau_1^{(2)}\tau_2^{(2)}$, takes on a form that to a certain extent is in
conformity with (380) provided we set a = b = 0. It consists essentially
of a Heisenberg and a Majorana force in which the first one, by
means of $(\boldsymbol{\tau}_1\boldsymbol{\tau}_2)$, exchanges only the charges and the second one, by
means of $(\boldsymbol{\sigma}_1\boldsymbol{\sigma}_2)(\boldsymbol{\tau}_1\boldsymbol{\tau}_2)$, exchanges the spins as well. There is, however,
in (379) a further term with the factor $(\boldsymbol{\sigma}_1\mathbf{r})(\boldsymbol{\sigma}_2\mathbf{r})$ which represents a
tensor force that is dependent on the spins and is not foreseen in the
phenomenological theory. This term is important for the explanation
of the quadrupole moment of the deuteron. As is seen from (376),
its expression contains a term that depends on r like $1/r^3$, and this
divergence makes the theory a failure, for it can be shown easily that,
in the case of an attraction with a divergence greater than $1/r^2$, no
stationary state of lowest energy exists. In order to prove this we


imagine that one of the two nucleons is fixed in space. Let r be the
radius of the orbit belonging to the stationary state, that is, the radius
of the region within which the wave function of the movable particle
differs noticeably from zero. According to the uncertainty relations
the momentum would then be of the order of magnitude ħ/r, and from
this it can be inferred that the kinetic energy has to be of the order of
$1/r^2$. This means that, if r is decreased, the decrease of the potential
energy would outweigh the increase of kinetic energy, the effect being
that the total energy would diminish. Thus for any r a transition to
a smaller r has to be possible, that is, an atomic nucleus of finite radius
could not exist, in obvious contradiction to the experimental facts.
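
The order-of-magnitude argument above can be made concrete with a small sketch. The numbers are hypothetical and dimensionless (kinetic coefficient 1, attraction strength 0.5); only the powers of r matter. A kinetic term of order $1/r^2$ is compared with an attraction falling off like $1/r^3$ and, for contrast, like 1/r.

```python
def total_energy(r, strength, power, kinetic=1.0):
    """Order-of-magnitude energy: kinetic / r^2 - strength / r^power."""
    return kinetic / r ** 2 - strength / r ** power

radii = [1.0, 0.1, 0.01, 0.001]

# Attraction diverging like 1/r^3: the energy keeps falling as r shrinks.
cubic = [total_energy(r, 0.5, 3) for r in radii]

# Attraction like 1/r: the kinetic term wins at small r, so a minimum exists.
coulomb = [total_energy(r, 0.5, 1) for r in radii]
r_min = 2 * 1.0 / 0.5                  # stationary point of 1/r^2 - 0.5/r
e_min = total_energy(r_min, 0.5, 1)

print(cubic)
print(coulomb, r_min, e_min)
```

In the first list every reduction of r lowers the energy without limit, which is the collapse described in the text; in the second the energy has a finite lowest value.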

By combining a vector and a pseudoscalar field and by choosing the
constants in a suitable way, we could make the term with $1/r^3$ vanish,
but in this case the effect would be that the tensor force required for
the explanation of the quadrupole moment would vanish as well.
Thus the solution of the nuclear problem is impossible as long as the
divergences of the potential cannot be removed in a relativistically
invariant way (cf. Chapter 10).
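
The range quoted in this section follows from the relation $m_0 = \hbar\kappa/c$, that is, $1/\kappa = \hbar/(m_0 c)$. As a short arithmetic check, the sketch below evaluates it for a meson of 200 electron masses. The CGS values of ħ, the electron mass, and c are standard constants supplied here as assumptions; they are not given in the text.

```python
# CGS constants (standard values, supplied here as assumptions).
HBAR = 1.054571817e-27        # erg s
ELECTRON_MASS = 9.1093837e-28  # g
C = 2.99792458e10             # cm / s

def meson_range_cm(mass_in_electron_masses):
    """Range 1/kappa = hbar / (m0 c) of the meson field, in cm."""
    m0 = mass_in_electron_masses * ELECTRON_MASS
    return HBAR / (m0 * C)

range_cm = meson_range_cm(200)
print(range_cm)
```

The result is close to $2 \times 10^{-13}$ cm, and it scales inversely with the assumed meson mass.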

65. Nuclear Scattering of Mesons. Mesons are deviated from
their straight paths when passing through matter. These deviations,
like those of photons which are caused by electrons, come about in a
way that a nucleon absorbs a meson of given momentum ħk and in
return emits another of momentum ħk'. In the following we shall
treat the process of elastic scattering only, for which the momenta
ħk and ħk' of the absorbed and emitted meson are of equal amount.
For this purpose we suppose that the nuclear particle is infinitely
heavy so that it does not undergo a noticeable change in energy by
the absorption and emission of a meson.

For the treatment of scattering processes, according to (186), we
must begin with the evaluation of the quantity

$$H' = \sum_Z \frac{H_{AZ}' H_{ZE}'}{E_A - E_Z}$$


which determines the probability of a transition A → E through an
intermediate state Z. We consider an initial state A in which, besides
the nuclear particle, only one vector meson of momentum ħk is present;
in the final state this meson is assumed to have disappeared, making
room for another of momentum ħk'. The transition is effected in
such a way that the nucleon absorbs the first meson and emits the
second one. If first we consider the case of a neutral meson, there are
two possibilities:


(i) The meson ħk is absorbed first, and then the meson ħk' is emitted.
The system then passes through an intermediate state $Z_1$ in which no
meson is present.

(ii) The above processes occur in the reverse order. In this case
both of the mesons are present in the intermediate state $Z_2$.


The matrix elements of (367) are to be substituted for $H_{AZ}'$ and
$H_{ZE}'$. If we consider a longitudinal meson and make the extension l
of the cubic space equal to unity, we shall have


[a Soke 
Han! = Hay’! = ig — ke** 
€ 
. hxe ik’: 
Hz,2' = Haz,’ = 7 , k'e Ke 
€ 


Furthermore, if ε and ε' denote the energies associated with the
momenta ħk and ħk',

$$E_A - E_{Z_1} = \epsilon, \qquad E_A - E_{Z_2} = -\epsilon'$$

For an elastic process we have k = k' and ε = ε', so that we obtain

$$H' = \frac{H_{AZ_1}' H_{Z_1E}'}{\epsilon} - \frac{H_{AZ_2}' H_{Z_2E}'}{\epsilon} = 0$$

That is, the probability of an elastic scattering process for neutral
mesons is zero, because the contributions of $Z_1$ and $Z_2$ cancel each other.

On the other hand, for charged mesons the cross section turns out
to be different from zero. If, for example, we take a proton and a
positive (longitudinal) meson, then, since the proton is not able to
absorb the meson, the scattering process can take place only through
the intermediate state $Z_2$, and according to (365) we obtain


Haze! H ax’ hex : , 
7? (es an gy’ ; ket k’)-r 


€ 


Apart from the sign, the same value holds for a negative meson which
can be scattered by a proton only through an intermediate state
$Z_1$, since the proton cannot emit a negative meson. The probability
dW of a meson being scattered in unit time so that the momentum
ħk' of the scattered particle lies within a certain solid angle dΩ,
according to (188), is given by

$$dW = \frac{2\pi}{\hbar}\, |H'|^2 \rho\, d\Omega$$


where ρ denotes the density which, when multiplied by dE, gives the
number of final states which have an energy between E and E + dE.
The number of meson states having a momentum of amount lying
between k and k + dk and of direction within dΩ, for l = 1, according
to (348) is given by

$$\frac{k^2\, dk\, d\Omega}{(2\pi)^3}$$

Thus we have

$$\rho\, dE = \frac{k^2\, dk}{(2\pi)^3}$$

Since $\epsilon = E = c\hbar\sqrt{\kappa^2 + k^2}$, then

$$dE = \frac{c\hbar k\, dk}{\sqrt{\kappa^2 + k^2}} = \frac{c^2\hbar^2 k\, dk}{\epsilon}$$

Thus ρ becomes

$$\rho = \frac{\epsilon k}{(2\pi)^3 c^2\hbar^2}$$
For dW we obtain 
_ mg tlcint gL ck 
UF ncttae ako biekteeite , 


In order to find the cross section dq we set dW = J dq, J denoting the
intensity of the meson radiation, that is, the number of mesons passing
per unit time through unit surface perpendicular to the radiation.
As the space with l = 1 is assumed to contain only one meson, J is
identical with the velocity v of the meson. From the relation
$m_0 v/\sqrt{1 - v^2/c^2} = \hbar k$ we obtain for this velocity

$$v = \frac{c^2\hbar k}{\epsilon}$$
so that for dq we obtain 


ee ee (381) 
(2r)c? aa | 
Thus the cross section for charged vector mesons should increase
infinitely with increasing energy. However, in this case we arrive at
values which, compared with the measured values† for energy above


† F. L. Code, Phys. Rev., 59, 229 (1941), and R. P. Shutt, Phys. Rev., 61, 6 (1942).


10° ev, turn out to be far too high. According to measurements by 
Code, the cross section for mesons with an energy of 0.8 X 10° ev is 
about $0.6 \times 10^{-27}$ cm², whereas the theory gives one of 0.6 x 10~**
cm². A better agreement with experiment is had on the assumption of
scalar or pseudoscalar mesons. Then the factor k* in the numerator 
drops out, so that the cross section should diminish rapidly with 
increasing energy. Whether this is true can be decided only by further 
measurements. 
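
The kinematic relations used in this section (the density of final states ρ and the meson velocity v) can be checked numerically. The sketch below works in hypothetical natural units with ħ = c = κ = 1, chosen only for convenience; the function names are illustrative and no cross-section value is computed.

```python
import math

HBAR = 1.0   # hypothetical natural units
C = 1.0
KAPPA = 1.0

def energy(k):
    """epsilon = c hbar sqrt(kappa^2 + k^2)."""
    return C * HBAR * math.sqrt(KAPPA ** 2 + k ** 2)

def velocity(k):
    """v = c^2 hbar k / epsilon."""
    return C ** 2 * HBAR * k / energy(k)

def density(k):
    """rho = epsilon k / ((2 pi)^3 c^2 hbar^2)."""
    return energy(k) * k / ((2 * math.pi) ** 3 * C ** 2 * HBAR ** 2)

def momentum_from_velocity(v):
    """m0 v / sqrt(1 - v^2/c^2) with m0 = hbar kappa / c."""
    m0 = HBAR * KAPPA / C
    return m0 * v / math.sqrt(1 - v ** 2 / C ** 2)

def denergy_dk(k, h=1e-6):
    """Central-difference derivative of the energy with respect to k."""
    return (energy(k + h) - energy(k - h)) / (2 * h)

ks = (0.3, 1.0, 5.0)
momentum_errors = [abs(momentum_from_velocity(velocity(k)) - HBAR * k) for k in ks]
state_count_errors = [abs(density(k) * denergy_dk(k) - k ** 2 / (2 * math.pi) ** 3)
                      for k in ks]

print(momentum_errors, state_count_errors)
```

Both checks confirm that ρ dE = k² dk/(2π)³ and that the velocity obtained from the relativistic momentum relation equals c²ħk/ε, which stays below c for every k.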

66. Magnetic Moment of Proton and Neutron. Since both
the proton and neutron have the spin ½, then, according to Dirac's
theory, it should be expected that the magnetic moment of the proton
is a nuclear magneton eħ/2Mc, where M is the mass of the proton.
On the other hand, the moment of the uncharged neutron should be
zero. Actually, however, measurement furnishes 2.785 magnetons for
the proton and −1.935 for the neutron. The explanation of these
unexpected values must be sought in the transmutability of the two
particles. Let us suppose that a proton is brought into a weak magnetic
field. The spin will then take an orientation parallel to the field,
thus giving rise to a magnetic moment of the same direction and magnitude
of a magneton, but the moment will change when the proton
transmutes itself into a neutron by the emission of a positive meson.
By an investigation of the transition probabilities† it can be shown
that a proton in a magnetic field can only emit a meson the spin of
which has the direction of the field. By the emission, therefore, the
particle takes on a spin −½ which, since the particle is no longer
charged, is not able to produce a moment. Instead of the particle,
the emitted meson has a moment which by measurement is ascribed
to the proton. From the probability with which per unit time a proton
is transmuted into a neutron by the emission of a meson, the fraction
α of the time during which the particle is a neutron can be evaluated,
whereas in the remaining time, 1 − α, it is a proton. Then in the
time 1 − α the moment, expressed in nuclear magnetons, is unity.
In the time α, however, it will be $M/m_0$, so that the measurement will
give a moment of the magnitude $1 - \alpha + \alpha(M/m_0)$. The possibility
that the transmutation is due to the emission of a scalar or
pseudoscalar meson can still be considered. The moment of the meson
then is zero, and for the moment of the proton we obtain
$\mu_P = 1 - \alpha - \beta + \alpha(M/m_0)$ if β represents that part of the time during which
the nucleon exists as a neutron plus a scalar meson. If we let $\mu_P$

† For this purpose, we have to expand the emitted field in terms of spherical
waves. Cf. H. Fröhlich, W. Heitler, and N. Kemmer, Proc. Roy. Soc. London,
166, 154 (1938).


equal $1 - \alpha + \alpha((M/m_0) - (\beta/\alpha))$, at any rate we shall have
$\beta/\alpha \ll M/m_0$ and $\alpha \ll \alpha(M/m_0)$, so that we obtain $\mu_P \approx 1 + \alpha(M/m_0)$.

The probability of a transmutation of a neutron into a proton by
the emission of a negative meson is the same as that for the inverse
transmutation of a proton. Hence, during the time α, the neutron
will be a proton. In this case the spin is −½ and the total moment of
particle and meson is $-1 - M/m_0$, so that $\mu_N = -\alpha - \alpha(M/m_0) \approx
-\alpha(M/m_0)$. From this simple consideration it is readily seen that
by the transmutation of the nucleons the proton obtains a moment
$\mu_P > 1$, and the neutron a moment $\mu_N < 0$. We should expect
$\mu_P + \mu_N \approx 1$, a relation that, at least approximately, is actually confirmed.
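
The simple estimate above can be put into numbers. The sketch below uses the measured moments quoted in the text and the approximate relations $\mu_P \approx 1 + \alpha(M/m_0)$ and $\mu_N \approx -\alpha(M/m_0)$. Two inputs are assumptions rather than statements of the text: the proton mass of about 1836 electron masses (a standard value) and the meson mass of 200 electron masses (the round figure used in Section 64).

```python
MU_P = 2.785    # measured proton moment in nuclear magnetons (from the text)
MU_N = -1.935   # measured neutron moment in nuclear magnetons (from the text)

# M / m0: proton mass ~1836 electron masses (assumed standard value),
# meson mass ~200 electron masses (round figure from Section 64).
MASS_RATIO = 1836.0 / 200.0

# Fraction of time alpha from mu_P ~ 1 + alpha M/m0 and mu_N ~ -alpha M/m0.
alpha_from_proton = (MU_P - 1.0) / MASS_RATIO
alpha_from_neutron = -MU_N / MASS_RATIO

moment_sum = MU_P + MU_N

print(alpha_from_proton, alpha_from_neutron, moment_sum)
```

The two estimates of α come out close to each other (roughly one fifth), and the sum of the moments is 0.85, which is the sense in which the relation $\mu_P + \mu_N \approx 1$ holds only approximately.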

In the rigorous theory we have to evaluate the self-energy of a
nucleon which arises from the interaction with the corresponding meson
field when a weak magnetic field H is acting on the nucleon. According
to (369) this energy is given by

$$E^2 = \sum_n \frac{H_{on}' H_{no}'}{E_o - E_n}$$

In the transition o → n, a nucleon at rest emits a meson the spin of
which has the direction of H and which in the transition n → o is
reabsorbed. Therefore $E_o - E_n$ is, apart from the sign, essentially equal
to the energy of the emitted meson consisting of the part
$\hbar c\sqrt{\kappa^2 + k^2}$ and the energy $-(e\hbar/2m_0c)H$ which is due to the magnetic
moment $e\hbar/2m_0c$. Strictly speaking, we should take into account also
the magnetic energy of the proton, which energy disappears in the
transition o → n, but we may neglect this since we are considering only
the effect of the emitted meson. Accordingly we must substitute in the
denominator of the expression for $E^2$ the quantity
$-[\hbar c\sqrt{\kappa^2 + k^2} - (e\hbar/2m_0c)H]$. By expanding in terms of the field
strength H, assumed small, we get a series $W_0 - \mu'H + \cdots$ which by
means of the factor μ' of H gives the desired supplementary moment,
since the energy due to H is given by the product of field and moment.
In order to evaluate μ' we must take the expressions (366) for $H_{on}'$ and
$H_{no}'$ and, as in Section 64, replace the sum by an integral. In this
calculation we have to keep in mind the fact that in the presence of a
magnetic field only mesons with spin in the direction of H can be
emitted. We then get


ER? = — Lilt ‘ dk Mec Bec rdhlcee Met. 
(Vx? + kh? — eH /2moc?)? 


If we neglect the terms with $H^2$, the denominator in the integral term
can be written


$$\kappa^2 + k^2 - \frac{eH}{m_0c^2}\sqrt{\kappa^2 + k^2} \approx \kappa^2 + k^2 - \frac{eH}{\hbar c}$$


wherein $\sqrt{\kappa^2 + k^2}$ is put equal to $\kappa = m_0c/\hbar$. On multiplying
numerator and denominator by $\kappa^2 + k^2 + eH/\hbar c$ we obtain
Uteett fy, 


dk 


ee ad nee i ee 
Bait 3n°c he (x? + k?)? 


and thus the supplementary moment sought after is 


2, k* 
gt = LdlPne = ee dee 


~ Br ch 


A similar calculation is to be carried out for the neutron, the only
difference being that the neutron emits a negative meson the magnetic
energy of which is $+(e\hbar/2m_0c)H$, so that we get the relation $\mu_N' = -\mu_P'$.

The integration in (382) is from k = 0 to k = ∞, which leads to an
infinite value of μ'. As we have seen in Section 60, this result is
inevitable, according to the perturbation theory, when a quantity is
calculated to which any wavelength of the field furnishes a contribution.
As long as quantum mechanics is not amended by a principle
that limits the wavelengths in a relativistically invariant way,
equation (382) can be made convergent only by means of a "cutting-off"
process wherein the upper limit of the integral $k_0$ is of the order of
magnitude κ. One then obtains for $\mu_P'$ a value the order of magnitude
of which corresponds to experiment. A method which provides the
same result without any arbitrariness is to be discussed in Chapter 10.

67. Mesons in an Electromagnetic Field. Mesons play an
important role in cosmic radiation, the hard component of this radiation,
with good reason, being supposed to consist of them. When
passing through the atmosphere, the mesons undergo different processes,
those in which an electromagnetic field is acting being of
particular importance. Thus we have to investigate the behavior of
mesons in an electromagnetic field with special consideration for the
case of a light wave. Equivalently, the question concerns the way
in which the wave equation of the meson must be changed when an
electromagnetic field described by the four-potential ($\Phi_1, \Phi_2, \Phi_3 =
\mathbf{A}$, $\Phi_4 = i\varphi$) is present. According to Section 49, in this case the
operators $\partial/\partial x_\alpha$, with $x_4 = ict$, are to be replaced by
$\partial/\partial x_\alpha + (ie/\hbar c)\Phi_\alpha$ and $\partial/\partial x_\alpha - (ie/\hbar c)\Phi_\alpha$ respectively. Taking a
vector meson field as an example, instead of (257), we obtain the equations


“ 
(eee) (tana 


Puig ; (383) 
ae ie 
and, instead of (258), 
(2 a te .) = wf 
(384) 


mS =n te Rcd * 
(,". he ) xa “ve 


The latter equations, in which $\chi_{\alpha\beta}$ are defined by (383), can be derived
from the Lagrangian $L = \int dv\, \mathcal{L}$ with


jie es 5 Xat Xap — Khe*Wa (385) 


L being understood as a function of the independent variables y, and 
¥.*, for by varying one of the variables ¥g* into ¥g* + dys*, we obtain 


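$$\delta\int dt\int dv\,\mathcal{L} = -\int dt\int dv\left[\sum_\alpha\chi_{\alpha\beta}\,\delta\chi_{\alpha\beta}^* + \kappa^2\psi_\beta\,\delta\psi_\beta^*\right]$$

$$= -\int dt\int dv\left[\sum_\alpha\chi_{\alpha\beta}\left(\frac{\partial}{\partial x_\alpha} - \frac{ie}{\hbar c}\Phi_\alpha\right)\delta\psi_\beta^* + \kappa^2\psi_\beta\,\delta\psi_\beta^*\right]$$

$$= \int dt\int dv\left[\sum_\alpha\left(\frac{\partial}{\partial x_\alpha} + \frac{ie}{\hbar c}\Phi_\alpha\right)\chi_{\alpha\beta} - \kappa^2\psi_\beta\right]\delta\psi_\beta^*$$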


The summation is not to be extended over $\beta$, which now has a definite
value, and thus in the term with $\chi_{\alpha\beta}\,\delta\chi_{\alpha\beta}^*$ the factor $\tfrac{1}{2}$ drops out. On
setting the expression equal to zero, we obtain (384).

The momenta $\pi_k$ and $\pi_k^*$, conjugate to $\psi_k$ and $\psi_k^*$, can be determined from (385). Then we obtain


1 
Te =e Xa oH — aXe TH = TA* = 0 (386) 
uw uw 




And for the Hamiltonian we obtain 
3 

5 igual dbx dp* | ae 

ay Gnt wom Tt 


3 
= ef Tk [eter + ; (Bal, — Depa) + tc 21) 
ees | 
3 


+ 7,* ent a (Sal* — Dypa*) + tc an") cr, ry 
i 


5 le Bebe aal 
t= aL ie ve* re 
0 te F) ie 
ae; * he <4.) al” (2 4 i a) 4.| + Ke"Ye (387) 


The quantities $\Phi_i$ describing the electromagnetic field are here considered as given space-time functions, that is, we disregard the reaction
of the mesons on the field. If, according to (384) and (385), $\psi_4$ and
$\psi_4^*$ are replaced by


3 3 
ic fe) te ic (i) te 
=e — — @® * e—- _ — (2 +24) 
V4 Dili k) Tk V4 a ja Tk 


then (387) represents a function of $\psi_1$, $\psi_2$, $\psi_3$ and $\pi_1$, $\pi_2$, $\pi_3$ together with
the conjugate complex quantities. In the interaction between the
mesons and the electromagnetic field, only those terms are of consequence which contain the potential $\Phi$. The totality of these is given by


~ 


H! = 5 Gal(ry) — (eV) — 5 [vale) — vat") 
nae eae x Yt curl ¥) — @ X ¥-curl ¥*)] 
+ ie 5(@ X98 XW) = Fed) — GV) 
t. = « [(e®) div x* — (#*6) div x] — —- (rb) (x*) 
— S0@ x y*- curl ¥) — @ X¥-curl ¥*)] 


4+ @xv- xv) (388) 


The meaning of this expression becomes immediately clear if both the
electromagnetic field $\Phi$ and the meson field $\psi$ are quantized. According
to (302) and (316), we must in this case substitute for $\Phi$


ie > cei yee — Xie) (389) 
a3 


(In this we use the notation $X_{ij}$ instead of $A_i$ and take into account
the fact that $B_i = A_{-i}$.) The summation over $j$ runs from 2 to
3. The time component $\Phi_4$ of the field, which may be considered an external
force, does not enter into the quantization. On the other hand, according to (288) and (291), we have


v = ) Cen Ass — By;)e*s" "= » cic Aes + B,;)e—*** (390) 
1j ij 


If in (388) we substitute these expressions, for $H'$ we obtain a matrix
that represents a sum of matrix products. It is sufficient to consider
the products that arise from one of the terms of (388), for example,
from $(\boldsymbol{\pi}\boldsymbol{\Phi})\,\mathrm{div}\,\boldsymbol{\pi}^*$. According to (389) and (390) we have


[ a0 (we) divxt = [ do Cam(esseiy)(Xise™™ — Xye*) 
xX (Ay + Biy)e™ (Ami + Bmie™™ (391) 


$C_{ilm}$ denoting a factor that depends on $i$, $l$, $m$. To comprehend the
significance of the matrix we must consider that, because of $X_{ij}$,
$A_{ij}$, $B_{ij}$ together with the conjugate complex matrices, only the
following transitions are permitted:


$X_{ij}^*$: Creation of a photon of the kind $ij$
$X_{ij}$: Annihilation of a photon of the above kind
$A_{ij}^*$: Creation of a positive $ij$ meson
$A_{ij}$: Annihilation of a positive $ij$ meson
$B_{ij}^*$: Creation of a negative $ij$ meson
$B_{ij}$: Annihilation of a negative $ij$ meson


The meaning of the products in (391) is immediately clear. For
example, the term with the product $X_{ij}A_{lj}A_{m1}^*$ determines the probability of a process by which an $ij$ photon together with a positive $lj$
meson disappears and in return a longitudinal positive meson is
created. The term $X_{ij}B_{lj}^*A_{m1}^*$ belongs to a process in which a disappearing photon makes room for a positive and a negative meson,
and so on.


With any product occurring in (391), a power of $e$ belongs as a
factor. For example, in the product $X_{ij}A_{lj}A_{m1}^*$, the factor is


$$e^{i(\mathbf{k}_i + \mathbf{k}_l - \mathbf{k}_m)\cdot\mathbf{r}}$$


the integral of which differs from zero only if $\mathbf{k}_i + \mathbf{k}_l = \mathbf{k}_m$, that is, if
the momentum of the two disappearing particles is equal to the momentum of the created meson. Thus the only transitions which are
possible are those in which momentum is conserved.


PROBLEMS 


1. Calculate the products wo, w1, etc., defined by (347). 

2. Calculate the invariants (349), making use of the results of Problem 1. 

3. According to (373) and (374), the potential of the nuclear forces comes about 
by the interference of waves. How must the spectrum of the waves be limited in 
order that the potential become approximately constant between $r = 0$ and $r = r_0$?

4. Discuss the various terms arising from the evaluation of (391). 


10 
INTRODUCTION OF 
A FUNDAMENTAL LENGTH 


68. The Idea of a Fundamental Length. In the preceding 
chapters we saw that, when we attempt to apply the principles of 
quantum mechanics to the electromagnetic field or some other wave 
field, we are faced with the difficulty that certain definitely finite 
quantities such as the self-energy of the electron or the energy of a 
cavity radiation become infinite. Some divergences, for example the 
infinite Coulomb self-energy of a point charge, are not associated with 
the idea of quantum mechanics but originate in the point character of 
the particles and hence have already appeared in the literature of 
classical physics. In what follows we shall not refer to this sort of
divergence but will confine ourselves to those due to quantum mechanics. These arise because the theory is forced to admit
waves of arbitrarily small wavelength and so to account for a field with an
infinite number of degrees of freedom. A typical example of this
kind, although of little moment, is the infinite energy which, according to the theory, should correspond to a cavity radiation. Consider
the radiation enclosed in a cubic box. Here, in evaluating the zero-point energy, the theory must consider every stationary wave which
fits the dimensions of the box and whose realization is, therefore,
possible. But this condition is fulfilled for an infinity of waves
with indefinitely decreasing wavelengths. Since a zero-point energy
$h\nu/2$ must be assigned to each of these waves (cf. Section 56),
the summation gives an infinite value for the whole energy.
All the divergences that occur in quantum mechanics, in so far as they 
are not due to the point character of the particles, are of this kind, 
as can be seen, for example, upon evaluating the transverse self-energy 
of the electron (cf. Section 60), or that of the magnetic moment of 
the proton or neutron (Section 66). There can be no doubt that the 
only remedy for these divergences is the introduction of a principle 
that limits the number of waves by removing the effectiveness of 
those waves with a frequency exceeding a certain limit. However,
this principle must satisfy the condition for relativistic invariance.
For example, we must not simply decide that waves with frequencies 
above a certain limit are ineffective, since we can increase or decrease 
a given frequency by any amount by transferring to another coordinate 
system, the consequence being that the limit would be dependent on 
the choice of a reference system. For this reason we must try to 
attain our end in such a way that we start from the theory, as devel- 
oped in the preceding chapters, which we know satisfies the require- 
ments of relativity, and modify this theory by means of an invariant 
correction which has the effect of rendering ineffective waves of 
extremely small wavelengths. 
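
The divergence of the zero-point energy described above can be made concrete with a small numerical sketch. The following Python code is an illustration and not part of the original text. It assumes a cubic box of side $L$ with standing-wave frequencies $\nu = (c/2L)\sqrt{n_1^2 + n_2^2 + n_3^2}$ for positive integers $n_1, n_2, n_3$, two polarizations per wave, and a hypothetical cutoff `n_max` on $\sqrt{n_1^2 + n_2^2 + n_3^2}$; the function name `zero_point_sum` is likewise an assumption made here.

```python
import math


def zero_point_sum(n_max):
    """Zero-point energy in units of h*c/(2L), keeping only waves with |n| <= n_max.

    Assumption (for illustration): nu = (c / 2L) * |n| for positive-integer
    triples n, with two polarizations per wave, each carrying h*nu/2.
    """
    total = 0.0
    for n1 in range(1, n_max + 1):
        for n2 in range(1, n_max + 1):
            for n3 in range(1, n_max + 1):
                norm = math.sqrt(n1 * n1 + n2 * n2 + n3 * n3)
                if norm <= n_max:
                    # two polarizations, each with zero-point energy h*nu/2
                    total += 2 * 0.5 * norm
    return total


if __name__ == "__main__":
    for n_max in (5, 10, 20, 40):
        print(n_max, zero_point_sum(n_max))
```

The sum grows roughly as the fourth power of the cutoff, so that no finite value results when the cutoff is removed. A finite energy is obtained only if waves beyond some limit are rendered ineffective.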

Before proceeding with this plan, let us explain first the physical 
meaning of the method to be followed in settling this problem of 
divergences. Evidently such a procedure must be based on the intro- 
duction of a universal constant which, in a manner similar to Planck’s 
constant h or to the velocity of light constant c, limits the possibility 
of observation. The present situation in quantum mechanics is some- 
what similar to that which existed in physics prior to the development 
of the theories of relativity and quantum mechanics. In both of 
these cases the old theory turned out to be a failure because it was 
based on an incorrect estimate of the possibilities of measurement. 
Pre-relativistic theory held the view that the simultaneousness of 
events, occurring at different places, may be defined by means of 
signals transmitted from one place to another with an infinite velocity, 
whereas this velocity actually is limited by the constant c. On the 
other hand, classical mechanics considered it possible to measure the 
values of two conjugate observables simultaneously with any desired 
degree of accuracy. Relativity theory and quantum mechanics 
originated from the rectification of these prejudices. And so we may 
presume that it is again an incorrect estimate of the possibilities of 
observation which introduces intrinsic difficulties into quantum 
mechanics. The question that arises this time is whether it is possible 
to ascertain the position of a particle as accurately as we wish or 
whether the accuracy is limited by a universal constant. As a matter 
of fact, we shall see that the alteration of the mathematical formalism
suggested in what follows corresponds to the thesis: It is, on principle,
impossible to invent an experiment of any kind that will permit a distinction between the positions of two particles at rest the distance of which
is below a certain limit $l_0$. In other words, we cannot ascertain the
position of a particle at rest with an accuracy better than a possible error $l_0$. It should be noticed that this theorem applies to particles
at rest, that is, the particles must be known to be at rest before the
measurement is made. The constant $l_0$ has the dimension of a length
and is, as we shall see, of the same order of magnitude as the
radius of the classical electron, with which, however, $l_0$ is not identical.
In the following we shall call $l_0$ the fundamental length and consider it
a constant of first order which, like Planck's constant $h$ and the light
velocity constant $c$, limits the possibilities of observation.

69. Introduction of $l_0$ into the Interaction Terms. The problem to be solved is the introduction of $l_0$ into the theory in such a way
that the divergences caused by the quantum-mechanical formalism
disappear without, at the same time, destroying the invariance of the
theory. For this purpose we consider a system consisting of a field
of some kind and particles. The field may be described by a scalar
or tensor function of the coordinates and the time, that is, by one of
the $\psi$ functions (with one or more components) discussed in Chapter
7. As for the particles, we are concerned solely with electrons and
nucleons, that is, with particles of spin $\tfrac{1}{2}$ which can be represented
by a Dirac function with four components. The physical nature of
the system can then be described by a Hamiltonian composed of three
parts, one referring to the field, another to the particles, and the third,
denoted by $H'$, corresponding to the interaction between the field
and the particles. We are concerned only with the latter part since
all the divergences here are due to $H'$. With a few exceptions $H'$
depends on $\psi$ linearly. Interaction terms which do not agree with this
condition may be disregarded at first; they will be discussed later.
As examples we may consider the interaction (326) of an electromagnetic field $\psi = A_i$ upon an electron:


$8 de alec) (326) 


or that of a pseudoscalar field $\psi$ upon a nucleon:


$$H' = -ig(\boldsymbol{\sigma}\,\mathrm{grad}\,\psi^*) - ig^*(\boldsymbol{\sigma}\,\mathrm{grad}\,\psi) \tag{360}$$


In these expressions $A_i$ and $\psi$ are meant as representations of wave
fields. In order to take into account the corpuscular character of the
field also, which is manifested by the appearance of photons and
mesons, we have to quantize the field by expanding $A_i$ and $\psi$ into
series $\sum_i q_i u_i$ and transcribing the coefficients $q_i$ into matrices
$Q_i$. As we have seen, the field then changes into a system of oscillators the behavior of which permits a very simple description using
the matrices $A_i$ and $B_i$, introduced in Chapter 7. Associated with the
unperturbed system of field and particles is a principal system $K$ in
the Hilbert space the axes of which have the following meaning.
When the vector $x$, which represents the state of the system (for
which $H'$ is assumed to be zero), has the direction of an axis of $K$,
the field consists of $n_1$ particles (photons or mesons) of the sort 1,
$n_2$ particles of the sort 2, and so on, whereas the material system (for
simplicity, one electron or one nucleon only) is in one of its stationary
states. This means that any axis of $K$ can be characterized by a
sequence $n_1 n_2 \cdots$ of numbers which define the state of the field
and a number $a$ which determines the state of the material particle.
To simplify the notation we shall write $n$ instead of $n_1 n_2 \cdots$, in
this way distinguishing the different states of the field by numbers $n$,
each one of which represents a certain sequence $n_1 n_2 \cdots$. Then
the axes of $K$ can be characterized by $1a$, or $2b$, and so on, $1a$ meaning
that the field is in the state $n_1 n_2 \cdots$ that corresponds to $n = 1$ and
the particle is in the state $a$.

Now let us take into account the interaction $H'$. Whereas the
other two terms of the Hamiltonian $H$ are diagonal in $K$ (defined as
the principal system belonging to the energy of the unperturbed
system), $H'$, by the quantization of $\psi$, transforms into a matrix which
is a linear function of $A_i$ and $B_i$ and is not diagonal in $K$ but consists of
elements connecting a field state $n$ with another state $n'$ which differs
from $n$ in that it contains one photon or meson of some kind more or
less than $n$. That this is so can be seen at once from the expressions
of (222) for the matrices $A_i$ and $B_i$. For example, if the states $1a$
and $2b$ satisfy this condition, an element is contained in $H'$ which
mediates a transition by means of which the particle changes its state
from $a$ to $b$, simultaneously emitting or absorbing a photon or meson.
Therefore a certain wave vector $\mathbf{k}$ belongs to any element of $H'$
which, by $\hbar\mathbf{k}$, determines the momentum of the particular photon or
meson with which the element is concerned.

Here arises the question of how the expression for the interaction 
H’ is to be altered in order that the divergences that occur in the 
traditional theory may disappear. This alteration, aside from the 
purpose for which it is designed, must fulfill two conditions: 


(i) It must be relativistically invariant, that is, H’ as a part of the 
Hamiltonian possesses certain transformation properties, which proper- 
ties must remain unchanged. This requirement is satisfied if H’ is 
changed in such a way that it does not depend on the choice of the 
coordinate system. 

(ii) H’ must remain Hermitean; otherwise it would lose its physical 
interpretability. 


We comply with these requirements by multiplying every element
of $H'$ by an invariant factor determined in the following way. We
consider some element of $H'$ which may connect the states $na$ and
$n'b$, that is, the element $H'_{na,n'b}$. The space-time coordinate system
used may be denoted by $K$, and the wave vector of the photon or
meson emitted or absorbed in the transition $n \to n'$, when referred to
$K$, may be denoted by $\mathbf{k}$. The reference system $K$ is now exchanged
for that other system $K'$ in which the state $a$ of the electron or nucleon
is transformed into a state of rest. The meaning of this is clear if
the particle is free; for an electron or nucleon bound to an atom or
nucleus, the meaning is that the center of gravity of the atom or
nucleus is at rest relative to $K'$. The vector $\mathbf{k}$ is then transformed
into a certain vector $\mathbf{k}'$. Around the particle in the system $K'$ we
construct a sphere of diameter $l_0$ and evaluate the expression


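$$\kappa_1 = \frac{\left|\int dv\, e^{i\mathbf{k}'\cdot\mathbf{r}}\right|}{v\left|e^{i\mathbf{k}'\cdot\mathbf{r}}\right|}$$


in which the integration is to be extended over the volume $v$ of the
sphere. In the denominator the value of $e^{i\mathbf{k}'\cdot\mathbf{r}}$ at the center of the
sphere must be taken. The value of $\kappa_1$ is evidently unity for $|\mathbf{k}'| = 2\pi/\lambda' \ll 2\pi/l_0$ and zero for $|\mathbf{k}'| \gg 2\pi/l_0$. In a second transformation
we pass from $K$ to that system $K''$ wherein the state $b$ becomes a
state of rest and evaluate the expression


$$\kappa_2 = \frac{\left|\int dv\, e^{i\mathbf{k}''\cdot\mathbf{r}}\right|}{v\left|e^{i\mathbf{k}''\cdot\mathbf{r}}\right|}$$


Finally we combine $\kappa_1$ and $\kappa_2$ and obtain


$$\kappa = \frac{1}{2}\left[\frac{\left|\int dv\, e^{i\mathbf{k}'\cdot\mathbf{r}}\right|}{v\left|e^{i\mathbf{k}'\cdot\mathbf{r}}\right|} + \frac{\left|\int dv\, e^{i\mathbf{k}''\cdot\mathbf{r}}\right|}{v\left|e^{i\mathbf{k}''\cdot\mathbf{r}}\right|}\right] = \frac{1}{2}(\kappa_1 + \kappa_2) \tag{392}$$


This is an invariant factor the value of which does not depend on the
coordinate system $K$, since we have to pass from any system $K$ to
the same systems $K'$ and $K''$. The value of $\kappa$ lies between 0 and 1;
it is zero if the wave vectors $\mathbf{k}'$ and $\mathbf{k}''$ correspond to wavelengths
$\lambda'$ and $\lambda''$, both of which are $\ll l_0$, and it is equal to unity if $\lambda'$ and
$\lambda''$ are $\gg l_0$. But $\kappa$ may have any value between these two limits; for
example, it has the value $\tfrac{1}{2}$ if one $\lambda \gg l_0$ and the other $\lambda \ll l_0$.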
The theory now is that any element of the interaction matrix $H'$
must be multiplied by the corresponding $\kappa$. By this procedure we
change $H'$ in such a way that the invariance of the theory is not
destroyed, since the new $H'$ matrix differs from the original only in an
invariant pattern imprinted on it. This pattern consists of the $\kappa$
factors and remains unchanged upon transformation to another
coordinate system, the effect being that $H'$ retains its transformation
properties. It is clear at once that the correction is gauge invariant
also, since the vectors $\mathbf{k}'$ and $\mathbf{k}''$ from which $\kappa$ is evaluated are independent of the choice of the potentials $\phi$ and $\mathbf{A}$. It is seen furthermore
that the Hermitean character of $H'$ is conserved since the correction
factor $\kappa$ for two elements of $H'$ which are symmetric relative to the
diagonal of $H'$ is the same. To achieve this we had to choose a
correction factor $\kappa$ which takes account of both the initial and final
states in the same way.
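
The factor $\kappa$ of (392) can be evaluated in closed form for a sphere. As an illustrative sketch, not part of the original text, the Python code below uses the standard result that the average of $e^{i\mathbf{k}\cdot\mathbf{r}}$ over a sphere of radius $R$ centered at the origin is $3(\sin x - x\cos x)/x^3$ with $x = |\mathbf{k}|R$. Here $R = l_0/2$, and the denominator of $\kappa_1$ equals unity at the center of the sphere. The function names and the choice of units (all lengths in units of $l_0$ in the demonstration) are assumptions made for illustration.

```python
import math


def kappa_single(wavelength, l0):
    """|integral of exp(i k.r) dv| / v over a sphere of diameter l0 centered on the particle."""
    x = math.pi * l0 / wavelength  # x = |k| R with |k| = 2*pi/wavelength and R = l0/2
    if x < 1e-3:
        # leading terms of the series, avoids cancellation for very long waves
        return 1.0 - x * x / 10.0
    return abs(3.0 * (math.sin(x) - x * math.cos(x)) / x ** 3)


def kappa(wavelength_rest_a, wavelength_rest_b, l0):
    """Combined factor (1/2)(kappa_1 + kappa_2) from the wavelengths measured in K' and K''."""
    return 0.5 * (kappa_single(wavelength_rest_a, l0) + kappa_single(wavelength_rest_b, l0))


if __name__ == "__main__":
    l0 = 1.0
    for lam in (1000.0, 10.0, 1.0, 0.1, 0.001):
        print(lam, kappa_single(lam, l0))
    print("one long, one short wave:", kappa(1000.0, 0.001, l0))
```

In this sketch the factor falls smoothly from unity to zero as the wavelength passes through $l_0$, which reproduces the limiting values stated in the text; the combined factor is close to $\tfrac{1}{2}$ when only one of the two wavelengths is large compared with $l_0$.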

The pattern of the $\kappa$ factors which is imprinted upon $H'$ evidently limits
the effectiveness of extremely small wavelengths, so that the corresponding transitions,
which the present theory has to admit, can no longer occur.
This effect is due to the universal length $l_0$, which plays the role of a
limiting constant. The integrals by which the $\kappa$ factors are defined are
not to be understood in the sense that by them we would ascribe to the particle an
extension in the ordinary sense of the word, that is, a space filled with
matter. We have made it clear already that such an
interpretation could not be reconciled with the view of relativity
(Section 60). For this reason the concept "extension of an elementary particle" must be given a new definition, and we do this by
means of the $\kappa$ factors. We forego any interpretation of these factors,
being content with the proof that the altered formalism gives a correct
description of the experimental facts.

70. Application to Electrodynamics. In order that the suggested scheme be useful it is essential that it leave unchanged those
results of the present theory which have proved to be correct, or at
least change them to so small a degree that the change escapes detection.
On the other hand, it must involve drastic alterations where the theory
openly contradicts the facts. For example, the theory would be disproved if the length $l_0$, which is of the order of magnitude $10^{-13}$ cm,
should prevent photons of a wavelength less than $l_0$ (of an energy above
$10^{8}$ ev) from interacting with electrons, for, actually, the phenomenon of cascade showers can be understood only on the assumption that
photons of at least $10^{11}$ ev must still be effective. On the other hand,
a limiting wavelength of the order of magnitude of $10^{-13}$ cm really
seems to exist for mesons, for there is experimental evidence that
mesons with a de Broglie wavelength very much smaller than $l_0$ are
neither emitted nor absorbed by nuclear particles. It is most essential
that the theory account for these seemingly inconsistent facts.




First let us consider the interaction of a photon with a free electron. 
According to (341) the probability of the absorption of a photon is 
determined by the matrix element 


EK cnitnetl = —12e V5 Vn / va*(aus)ve (341) 


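In this, $\psi_a$ and $\psi_b$ signify the states of the electron before and after the
absorption and can be represented in the form


$$\psi_a = a_a(\rho)\,e^{i\mathbf{p}_a\cdot\mathbf{r}/\hbar} \qquad \psi_b = a_b(\rho)\,e^{i\mathbf{p}_b\cdot\mathbf{r}/\hbar}$$


$a_a(\rho)$ and $a_b(\rho)$ being functions of the spin coordinate $\rho$, and $\mathbf{p}_a$ and $\mathbf{p}_b$
the momenta of the electron in the states $a$ and $b$. In addition (cf.
Section 54), we have

$$\mathbf{u}_s = \sqrt{\frac{4\pi c^2}{V}}\;\mathbf{e}\,e^{i\mathbf{k}\cdot\mathbf{r}}$$


where $\mathbf{e}$ denotes a unit vector which determines the direction of polarization and $\mathbf{k}$ the wave vector of the photon. Expression (341)
then becomes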


; h = |4c? - ; a 
HH’ ansom—1 = "86 2, ar V na*(ae)ay et Pat Py kA)/A 
t 


The integral differs from zero only if $\mathbf{p}_b - \mathbf{p}_a = \hbar\mathbf{k}$; this means that
only those transitions are possible in which momentum is conserved.
There is, however, no limit to $|\mathbf{k}| = 2\pi/\lambda$, and thus it should be
expected that any light, however small its wavelength may be, can be
absorbed. We now determine the correction factor $\kappa$ by which
$H'_{an_s,\,bn_s-1}$ is to be multiplied. For the sake of simplicity it can be
assumed that the electron is originally at rest so that $\mathbf{p}_a$ is zero. As
the system $K'$ is then identical with $K$, we have


$$\kappa_1 = \frac{\left|\int dv\, e^{i\mathbf{k}\cdot\mathbf{r}}\right|}{v\left|e^{i\mathbf{k}\cdot\mathbf{r}}\right|}$$


For light with a wavelength very much less than $l_0$ this fraction is
zero. On the other hand, the factor $\kappa_2$ does not vanish for any wavelength, for in order to evaluate $\kappa_2$ we have to pass to the $K''$ coordinate
system in which the electron which is moving with a momentum $\mathbf{p}_b$
relative to $K$ is at rest. If we choose the direction of $\mathbf{p}_b$ as the $x$ axis,
$K''$ moves along the $x$ axis with a velocity $v$ defined by

$$p_b = \frac{mv}{\sqrt{1 - v^2/c^2}} = \hbar k = \frac{h}{\lambda}$$

and thus

$$\beta = \frac{v}{c} = \frac{h/mc}{\sqrt{\lambda^2 + h^2/m^2c^2}} \qquad \sqrt{1 - \beta^2} = \frac{1}{\sqrt{1 + h^2/m^2c^2\lambda^2}}$$


According to a well-known formula of relativity theory, the wavelength
$\lambda$ is changed by the passage $K \to K''$ into

$$\lambda'' = \lambda\,\frac{1 + \beta\cos\theta'}{\sqrt{1 - \beta^2}}$$

where $\theta'$ is the angle between the direction in which the light is propagated and the $x$ axis, the measurement being made in the system $K''$.
Since in our case $\theta' = 0$, we get


$$\lambda'' = \lambda\,\frac{1 + \beta}{\sqrt{1 - \beta^2}} \approx \frac{2\lambda}{\sqrt{1 - \beta^2}} \approx \frac{2h}{mc} \tag{393}$$


$h/mc$ is of the order of magnitude $10^{-10}$ cm. Therefore $\lambda''$ is very
much greater than $l_0$, so that in any case $\kappa_2$ has the value unity. Thus
the effect of $l_0$ on the absorption of a photon by an electron which is
initially at rest is that the corresponding matrix element for $\lambda \gg l_0$
remains unchanged, but for $\lambda \ll l_0$ it is reduced to one-half the classical
value.
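
The statement that $\lambda''$ stays far above $l_0$ can be checked numerically. The sketch below is illustrative and not part of the original text. It assumes CGS values of $h$, $m$, and $c$, and it uses the relations $p_b = h/\lambda$ and $\lambda'' = \lambda(1 + \beta)/\sqrt{1 - \beta^2}$, rewritten as $\lambda'' = \lambda(1 + \beta)\gamma$ with $\gamma = \sqrt{1 + a^2}$ and $a = h/(mc\lambda)$ in order to avoid rounding errors when $\beta$ is close to 1. The function and constant names are hypothetical.

```python
import math

H_PLANCK = 6.62607015e-27    # Planck's constant, erg s
M_ELECTRON = 9.1093837e-28   # electron mass, g
C_LIGHT = 2.99792458e10      # velocity of light, cm/s
COMPTON = H_PLANCK / (M_ELECTRON * C_LIGHT)  # h/mc in cm


def wavelength_in_final_rest_frame(lam):
    """Wavelength of the absorbed photon as seen from K'', for an electron initially at rest.

    lam is the wavelength in K (cm). After absorption the electron carries
    the momentum h/lam, which fixes beta and gamma of the system K''.
    """
    a = COMPTON / lam                 # p_b / (m c)
    gamma = math.sqrt(1.0 + a * a)    # 1 / sqrt(1 - beta^2)
    beta = a / gamma
    return lam * (1.0 + beta) * gamma


if __name__ == "__main__":
    for lam in (1e-8, 1e-10, 1e-13, 1e-15):
        print(lam, wavelength_in_final_rest_frame(lam))
```

For wavelengths far below $h/mc$ the result tends to $2h/mc$, independently of $\lambda$, so that under these assumptions $\kappa_2$ is governed by the Compton wavelength rather than by the wavelength of the incident light.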

On emission, the effect is different. Here, again, the matrix element remains the same for $\lambda \gg l_0$, whereas for $\lambda \ll l_0$ both $\kappa_1$ and $\kappa_2$
vanish, for, through the emission, the electron gets a recoil in the
opposite direction, in consequence of which $\lambda''$ becomes smaller than
$\lambda$ since now $\lambda'' = \lambda(1 - \beta)/\sqrt{1 - \beta^2} < \lambda$. According to our theory
an electron at rest should, therefore, be incapable of the virtual
emission of a light quantum with a wavelength very much smaller
than $l_0$. As we shall see, this secures a finite value to the transverse
self-energy of an electron.

As an application of the theory, it may be sufficient to consider the
Compton effect. This effect concerns the scattering of a light wave
by a free electron; the electron absorbs a wave of frequency $\nu$ and in
return emits another of frequency $\nu'$ which in general differs from $\nu$.
According to quantum mechanics this process is accomplished in two
steps: the electron first changes from its initial state $A$ into an intermediate state $Z_1$ by absorbing the light quantum $h\nu$, whereupon it
passes to the final state $F$ by emitting the quantum $h\nu'$. But the
transition $A \to F$ can as well take place so that $h\nu'$ is first emitted
(intermediate state $Z_2$) and then $h\nu$ absorbed. As in all transitions,
momentum is conserved and, besides, the final energy must equal the
initial:


$$\frac{h\nu}{c}\,\mathbf{e} = \frac{h\nu'}{c}\,\mathbf{e}' + \mathbf{p} \tag{394}$$

$$h\nu + mc^2 = h\nu' + c\sqrt{m^2c^2 + p^2} \tag{395}$$


$\mathbf{e}$ and $\mathbf{e}'$ are two unit vectors indicating the directions of the absorbed
and emitted photons respectively, and it further is assumed that the
electron is at rest initially, moving afterward with the momentum $\mathbf{p}$.
If we denote the angle between $\mathbf{e}$ and $\mathbf{e}'$ by $\theta$, from (394) it follows that


$$p^2 = \frac{h^2}{c^2}\left(\nu^2 + \nu'^2 - 2\nu\nu'\cos\theta\right)$$


By introducing this expression into (395) we get the equation

$$\nu' = \nu\,\frac{mc^2}{mc^2 + h\nu(1 - \cos\theta)} = \frac{\nu}{1 + \dfrac{h\nu}{mc^2}\,2\sin^2\dfrac{\theta}{2}} \tag{396}$$

from which, for a given $\nu$ and $\theta$, the frequency $\nu'$ of the scattered light
wave can be computed.
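
As a numerical check of (396), the following sketch (an illustration, not part of the original text) computes the energy $h\nu'$ of the scattered quantum and verifies that the energy balance (395) is satisfied when $p$ is taken from the momentum balance (394). Energies are in electron volts; the function names and the numerical value used for $mc^2$ are assumptions introduced here.

```python
import math

MC2_EV = 0.51099895e6  # electron rest energy m c^2 in eV


def scattered_photon_energy(h_nu, theta):
    """Energy h*nu' of the scattered quantum according to (396)."""
    return h_nu / (1.0 + (h_nu / MC2_EV) * 2.0 * math.sin(theta / 2.0) ** 2)


def energy_balance(h_nu, theta):
    """Residual of (395) with p taken from (394); it should vanish up to rounding."""
    h_nu1 = scattered_photon_energy(h_nu, theta)
    # (p c)^2 from momentum conservation
    pc_sq = h_nu ** 2 + h_nu1 ** 2 - 2.0 * h_nu * h_nu1 * math.cos(theta)
    return h_nu + MC2_EV - h_nu1 - math.sqrt(MC2_EV ** 2 + pc_sq)


if __name__ == "__main__":
    for h_nu in (1.0e4, 1.0e6, 1.0e9):
        for theta in (0.0, 0.5, math.pi / 2.0, math.pi):
            print(h_nu, theta, scattered_photon_energy(h_nu, theta), energy_balance(h_nu, theta))
```

For $h\nu \gg mc^2$ and a finite angle, the computed $h\nu'$ approaches $mc^2/(2\sin^2(\theta/2))$, independently of the primary energy.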

It can be seen from (396) that the change $\nu - \nu'$ increases with increasing $\theta$. If the primary radiation is of an extremely short wavelength in
the sense that $h\nu \gg mc^2$ ($= \tfrac{1}{2}$ Mev), two regions of $\theta$ have to be
distinguished. For $\sin^2(\theta/2) \ll mc^2/h\nu$, that is, for a very small angle of
deflection, $\nu'$ is equal to $\nu$. However, for a finite $\theta$, $(h\nu/mc^2)\,2\sin^2(\theta/2) \gg 1$, and therefore $\nu' = mc^2/(2h\sin^2(\theta/2))$, that is, the wavelength of
the scattered light then becomes of the order of magnitude $h/mc$,
the Compton wavelength, $\lambda_0 = 0.4 \times 10^{-10}$ cm.

These relations are, of course, not changed by the $\kappa$ correction factors
since they do not influence the process of scattering as such but only
the probability of its occurring, which is determined by the elements of
the $H'$ matrix. According to Section 37 this probability is given by


dW = = p(w)|H|* dQ 


where


$$H = \sum\left[\frac{H'_{AZ_1}H'_{Z_1F}}{\epsilon_A - \epsilon_{Z_1} + h\nu} + \frac{H'_{AZ_2}H'_{Z_2F}}{\epsilon_A - \epsilon_{Z_2} - h\nu'}\right] \tag{397}$$


Here $dW$ means the probability that in unit time a photon $h\nu$ is
scattered in such a way that the direction of the scattered photon lies
within a solid angle $d\Omega$. $\epsilon_A$, $\epsilon_{Z_1}$, and $\epsilon_{Z_2}$ are the energies of the electron
in states $A$, $Z_1$, and $Z_2$ respectively. The summation is to be extended
over all the states having the same $\mathbf{p}'$ and $\mathbf{p}''$; there are four such states
for each momentum, differing in the direction of spin and the sign of
the energy. By evaluating (397)† we obtain the differential cross
section, which is given by the Klein-Nishina formula


$$d\sigma = \frac{1}{4}\,r_0^2\,d\Omega\left(\frac{\nu'}{\nu}\right)^2\left(\frac{\nu}{\nu'} + \frac{\nu'}{\nu} - 2 + 4\cos^2\Theta\right)$$

For $\lambda \gg l_0$, this formula is not changed by $l_0$. When $\lambda \ll l_0$, however,
$H'_{AZ_1}$ is diminished to half its value, for, according to our considerations regarding the absorption of a light wave, $\kappa_1 = 0$ and $\kappa_2 = 1$.
For the transition $Z_1 \to F$ we must distinguish the regions $\theta \approx 0$
and $\theta \neq 0$. In both cases the electron in the state $Z_1$ moves with the
velocity acquired in the transition $A \to Z_1$ by the absorption of a
primary photon, which therefore is given by $mv/\sqrt{1 - \beta^2} = h\nu/c$, from
which, since $v \approx c$, it follows that $\sqrt{1 - \beta^2} = mc^2/h\nu = \lambda/\lambda_0$, where
$\lambda_0$ denotes the Compton wavelength and $\lambda$ that of the primary radiation. The $K'$ system moves together with the electron, and therefore
the wavelength $\lambda'$ measured in $K'$ is given by $\lambda' = \lambda_1(1 + \beta\cos\theta')/\sqrt{1 - \beta^2}$, where by $\lambda_1$ we denote the wavelength of the scattered
photon as measured in $K$. $\theta'$ is the angle of deflection in $K'$ which,
according to a formula of relativity theory, is related to the angle $\theta$
measured in $K$ by

$$\cos\theta' = \frac{\cos\theta - \beta}{1 - \beta\cos\theta} \tag{398}$$

Since the denominator is positive, this fraction provides a positive
$\cos\theta'$ only for $\cos\theta > \beta$ or $\theta < \sqrt{2}\sqrt{1 - \beta} \approx \sqrt{1 - \beta^2} = \lambda/\lambda_0$.
Then we get $\cos\theta' = \cos\theta = 1$. Besides, the condition $\sin^2(\theta/2) \ll mc^2/h\nu$
is then fulfilled and, therefore, according to the preceding considerations, $\lambda_1 = \lambda$.
Hence we obtain $\lambda' = \lambda(2/\sqrt{1 - \beta^2}) = 2\lambda_0 \gg l_0$ and, therefore,
$\kappa_1 = 1$. For $\theta \gg \lambda/\lambda_0$, however, $\cos\theta' \approx (\cos\theta - 1)/(1 - \cos\theta) = -1$
and $\lambda_1$, for a sufficiently large $\theta$, increases to the value $\lambda_0$. Therefore we obtain


$$\lambda' = \lambda_0\,\frac{1 - \beta}{\sqrt{1 - \beta^2}} \ll l_0 \qquad \kappa_1 = 0$$


† Cf. W. Heitler, Theory of Radiation, Oxford University Press, 1948.


In order to evaluate $\kappa_2$ we must determine the wavelength $\lambda''$
measured in $K''$. $K''$ is the coordinate system in which the electron
is at rest in its final state. For $\theta = 0$, practically the whole energy is
transferred to the scattered photon, in consequence of which, relative
to $K$, the electron is at rest. Thus we have


$$K'' = K \qquad \lambda'' = \lambda \ll l_0 \qquad \kappa_2 = 0$$


For $\theta \neq 0$, relative to $K$, the electron is moving with nearly the velocity
$v$ mentioned above, since the energy of the scattered photon is negligible. Since $\lambda_1 = \lambda_0$, $\cos\theta' = -1$, we have

$$\lambda'' = \lambda_0\,\frac{1 - \beta}{\sqrt{1 - \beta^2}} \ll l_0 \qquad \kappa_2 = 0$$


Thus the result is that the matrix element $H'_{AZ_1}$ in (397) is to be
corrected by $\kappa = \tfrac{1}{2}$, and the element $H'_{Z_1F}$ by either $\kappa = \tfrac{1}{2}$ or $\kappa = 0$
depending on whether the scattering angle is nearly zero or has a
finite value. In any case the second sum in (397) vanishes, since we
have seen that $\kappa = 0$ for the emission term $H'_{AZ_2}$. Thus a photon with
$\lambda \ll l_0$ can be scattered only in the forward direction, with a probability
that is diminished to $\tfrac{1}{16}$ that of the Klein-Nishina value, since $H$ acquires the
factor $\tfrac{1}{2}\cdot\tfrac{1}{2}$ and the probability is proportional to $|H|^2$. The other
possibility of a deflection by a finite angle is excluded by the correction
factors.

In agreement with our thesis, it can be inferred from this result
that it is impossible to distinguish the positions of two particles by
means of a diffraction experiment performed with light rays if the
distance between them is less than $l_0$. For such an experiment, a
radiation would be required with a wavelength the order of magnitude
of which would correspond to the distance to be measured. If the
two particles are fixed in space, they cannot react upon a photon with
a wavelength very much smaller than $l_0$ because of the $\kappa$ factors which
make the corresponding matrix elements vanish. If, however, the particles are free, a photon with such a wavelength can only be deflected
by an angle $\theta < \lambda/\lambda_0$. On the other hand, a diffraction pattern can
only be had if there is a deflection angle $\theta$ for which $a\sin\theta = \lambda$ or
$\theta \approx \lambda/a$, which for the case $a < l_0$ is $\gg \lambda/\lambda_0$.


A second remark concerns the zero-point energy of a cavity radiation. In the present theory, this energy diverges because of the infinity of wavelengths which have to be considered in the evaluation
of the energy. Actually, however, waves with $\lambda \ll l_0$ cannot be
reflected from the walls of the box in which the radiation is enclosed,
as neither free nor fixed electrons are able to deflect a photon by a
finite angle. Thus in Section 56 the summation (314) is only to be
extended over a domain of wavelengths limited by $l_0$.

71. Bremsstrahlung—Transverse Self-Energy of an Electron. 
It has already been mentioned that the phenomenon of cascade
showers can be accounted for only on the assumption that photons of
any energy can be emitted by correspondingly fast-moving electrons
which are stopped by a nucleus. Therefore it is to be required that the
new theory be in accordance with this assumption. In order to show
this we consider an electron, the energy $E_0 = mc^2/\sqrt{1-\beta^2}$ of which
may have any value, moving against a nucleus. When passing
through the field of the nucleus, it will be deflected and simultaneously
it will emit a light quantum $h\nu$. The transition of the system from
the initial into the final state is accomplished by means of an
intermediate state Z, two different Z states being possible; either the
electron is first deflected (A → Z) and then the transition (Z → E)
occurs, whereupon the quantum is emitted, or the two processes occur
in the reverse succession. In any case the possibility of the transition
A → E depends on the matrix element $H'_{ZE}$ or $H'_{AZ}$ which corresponds
to the emission of light by a fast-moving electron. We denote the
momenta of the electron before and after the emission by $p_0$ and $p$ and
assume that the initial energy $E_0$ is very much greater than $hc/l_0$, so
that by the stoppage of the electron a light quantum of a wavelength
very much smaller than $l_0$ is created. The correction factor $\kappa_1$ depends
on the wavelength $\lambda'$ of the quantum relative to the system K′ moving
with the initial velocity $v_0$ of the electron. The velocity $v_0$ is defined
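by $mv_0/\sqrt{1-\beta_0^2} \approx mc/\sqrt{1-\beta_0^2} = p_0$, so that from (393) we
obtain

$$\lambda' = \frac{2\lambda}{\sqrt{1-\beta_0^2}} = \frac{2\lambda p_0}{mc}$$
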
For $p = 0$ (complete stoppage), the whole energy $c\sqrt{m^2c^2 + p_0^2} - mc^2 \approx E_0$
is transformed into a quantum $h\nu = hc/\lambda$. We obtain then
$\lambda' = 2\lambda(hc/\lambda mc^2) = 2h/mc$, which is greater than $l_0$, and
therefore $\kappa_1 = 1$ whereas $\kappa_2 = 0$.
Thus the matrix element $H'_{ZE}$ or $H'_{AZ}$ is then effective with only half
its value, so that the probability is diminished to 1/4. If, however, the
electron is not completely stopped, $\kappa_2$ may also be equal to unity,
so that the value of $\kappa$ becomes unity. This is the case if
$\lambda'' = 2\lambda/\sqrt{1-\beta^2}$ is greater than $l_0$, $\beta$ now corresponding to the
final velocity of the electron. From the energy law it now follows that


$$c\sqrt{m^2c^2 + p_0^2} - c\sqrt{m^2c^2 + p^2} = h\nu = \frac{hc}{\lambda}$$
or, if we assume that $E = c\sqrt{m^2c^2 + p^2}$, as well as $E_0$, is greater than
$hc/l_0$, then


$$p_0 - p = \frac{h}{\lambda}$$


On dividing this equation by $\lambda/\sqrt{1-\beta^2} > l_0$ or by $\lambda p/mc > l_0$,
since $mc/\sqrt{1-\beta^2} = p$, we get
$$\frac{p_0 - p}{p} < \frac{h}{mc\,l_0} \qquad \text{or} \qquad p \gtrsim p_0 \times 10^{-3}$$
This means, for example, that an electron of $10^{12}$ ev is slowed down to
$10^{9}$ ev. The probability of the creation of a light quantum by such a
process is not changed by $l_0$; therefore we can hold that the theory of
Bremsstrahlung is practically independent of $l_0$.
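
The order of magnitude of this bound can be checked numerically. The following is only a rough sketch: it assumes the value $l_0 \approx 5 \times 10^{-13}$ cm quoted in Section 72, uses CGS constants, and approximates the bound by $p/p_0 \gtrsim mc\,l_0/h$; the function name is chosen here for illustration.

```python
# Rough check of the bound p/p0 > m c l0 / h for an electron (CGS units).
H_PLANCK = 6.62607e-27    # erg s
M_ELECTRON = 9.10938e-28  # g
C_LIGHT = 2.99792e10      # cm/s
L0_ASSUMED = 5e-13        # cm; assumed value of l0, taken from Section 72


def min_momentum_ratio(l0_cm: float = L0_ASSUMED) -> float:
    """Approximate smallest p/p0 for which l0 leaves the emission unchanged."""
    compton = H_PLANCK / (M_ELECTRON * C_LIGHT)  # h/mc in cm
    return l0_cm / compton


ratio = min_momentum_ratio()
print(f"h/mc = {H_PLANCK / (M_ELECTRON * C_LIGHT):.3e} cm")
print(f"p/p0 must exceed roughly {ratio:.1e}")
```

With these inputs the ratio comes out at a few parts in a thousand, which is the order of magnitude used in the text.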

In the preceding considerations we have disregarded the possible
influence of $l_0$ on the deflection of the electron caused by the Coulomb
field of the nucleus. This influence cannot be treated with the aid of
the developed theory, because the Coulomb interaction does not fit
into the mathematical scheme on which the evaluation of the $\kappa$ is
based. If the momenta of the two colliding particles are given by
$p_{10}$, $p_{20}$ and $p_1$, $p_2$ respectively, the interaction V is a certain function of
these four momenta which can be transcribed into a matrix. The
procedure again is to furnish the elements of this matrix with invariant
correction factors, the meaning of which is that a diffraction experiment,
carried out with material particles, proves as much of a failure
in measuring a distance less than $l_0$ between two particles at rest as
the diffraction of light waves. We omit here the details of the theory
and find that a particle, when colliding with another of great mass,
owing to $l_0$ is not able to transfer to the latter a momentum $\Delta p$ of an
amount greater than $h/l_0$. This is the reason why particles with
momentum greater than $h/l_0$ cannot be reflected from a solid wall,
since the reflection would be associated with a change of momentum
that is excluded by $l_0$.




The influence of the constant $l_0$ on the Bremsstrahlung resulting from
the deflection is unimportant, consisting only in confining the emission
to the forward direction. This directional preference is partially due
to the conservation of momentum; hence it has already appeared in
accepted theory. The effect is intensified, however, by $l_0$ because
$l_0$ limits the amount of momentum transferred to the nucleus.

Hence it turns out that, in the applications of electrodynamics to
practical cases, the effect of the constant $l_0$ is of little consequence and
may often be completely neglected. The only conspicuous effect is
that it settles the divergence difficulties. We have already seen that
the zero-point energy of a cavity radiation becomes finite, for the
wavelengths are limited by $l_0$. This limitation also settles the divergences
that confront us in the higher approximations of the perturbation
theory (cf. Section 60). For example, consider the transverse
self-energy of an electron at rest. This is given by


P 5 J Vo*(aur)vo J Yo *(aur)vo 
E® = = a (346) 
2 me” — (Ey + hy) 
t 

The term is evaluated on the assumption that no photons are in the
field initially. The integrals in the sum are the matrix elements of
the interaction H′ between the field and the electron, and they
correspond to transitions in which the electron passes from rest to motion
or conversely, simultaneously emitting or absorbing a photon. In
these virtual processes momentum is conserved, that is, by the emission
of a photon the electron obtains a velocity in the opposite direction.
Consequently the factors $\kappa_1$ and $\kappa_2$ vanish for $\lambda < l_0$ so that
wavelengths less than $l_0$ cannot contribute to the self-energy. Thus the
summation in (346) must be extended only over those wavelengths
which are greater than $l_0$, that is, over a finite number of terms.

Although the $l_0$ theory removes the divergence of $E^{(2)}$, it does not
lead to the correct value of the magnetic self-energy of the electron
which is of the order of magnitude $mc^2$, for in working out $E^{(2)}$ we arrive
at about $50mc^2$. As the sum of the approximations $E^{(2)}$, $E^{(4)}$, and so
on, converges very slowly, an agreement of $E^{(2)}$ with the experimental
value is not to be expected a priori.

72. Application to the Nuclear Forces. Although the influence
of the limiting length $l_0$ on the electron is confined essentially to
removing the divergences arising in the theory, the situation is quite different
in the theory of the meson. The reason why $l_0$, at least practically,
plays such an unimportant part as far as the electron is concerned
(except for the occurring divergences, the incorrectness of the theory
would not have been noticed) is that the electron because of its small
mass is able to cope with any wavelength however small. In contrast,
a heavy nuclear particle on which a radiation of extremely small
wavelength is acting is, because of its great mass, hindered from taking on a
velocity sufficient to give the factor $\kappa$ a value considerably different
from zero. This must lead to phenomena which cannot be explained
without the help of $l_0$; for this reason in Chapter 9 we had to resort to
a "cutting-off" procedure wherein wavelengths less than $l_0$ are
cancelled. However, we shall see in Section 74 that it is not always
the short wavelengths with which the theory is unable to cope, but
that there are phenomena in the domain of quite normal energies as
well which cannot be accounted for without the help of $l_0$.

We investigate first the influence of $l_0$ on the potential of the nuclear
forces which were determined in Section 64. According to (372) and
(373), the potentials arising from the exchange of longitudinal and
transverse vector mesons are given by


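$$V_I = -\frac{g^2}{2\pi^2\kappa^2}\int d\mathbf{k}\,\frac{k^2}{\kappa^2 + k^2}\,e^{i(\mathbf{k}\mathbf{r})} \qquad (399)$$


$$V_{II} = -\frac{f^2}{2\pi^2\kappa^2}\int d\mathbf{k}\,\frac{(\sigma_1\sigma_2)k^2 - (\sigma_1\mathbf{k})(\sigma_2\mathbf{k})}{\kappa^2 + k^2}\,e^{i(\mathbf{k}\mathbf{r})} \qquad (400)$$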

In these expressions k multiplied by $\hbar$ denotes the momentum of the
exchanged meson, and the integrals relative to $k = 2\pi/\lambda$ are to be
taken from 0 to $\infty$, the result being that $V_{II}$ diverges for r = 0 like
$1/r^3$. This is prevented by the limit which $l_0$ imposes on the spectrum.
We may assume that the two nucleons are fixed in space by the forces
binding them to the nucleus so that they are unable to get into motion
by the emission or absorption of a meson. The correction factor $\kappa$
then becomes zero for any $k = 2\pi/\lambda$ which is greater than $k_0 = 2\pi/l_0$,
and thus we need integrate only from 0 to $k_0$. This also holds if the
particles are assumed to be free, since, in this case in which no mesons
are initially in the field, the field can be changed only by the emission
of a meson and this cannot occur for a wavelength less than $l_0$. The
potentials, therefore, now remain finite for r = 0 and $V_I$ assumes the
value


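$$V_I(0) = -\frac{2g^2}{\pi\kappa^2}\int_0^{k_0} dk\,\frac{k^4}{\kappa^2 + k^2} = -\frac{2g^2}{\pi\kappa^2}\left(\kappa^3\arctan\frac{k_0}{\kappa} - \kappa^2 k_0 + \frac{k_0^3}{3}\right)$$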
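
The closed form of the radial integral can be confirmed by direct numerical integration. This is a minimal sketch, not part of the original derivation; the values chosen for $\kappa$ and $k_0$ are illustrative only (units of $1/l_0$), and the helper names are invented here.

```python
import math


def closed_form(kappa: float, k0: float) -> float:
    """kappa^3 arctan(k0/kappa) - kappa^2 k0 + k0^3/3."""
    return kappa**3 * math.atan(k0 / kappa) - kappa**2 * k0 + k0**3 / 3.0


def simpson(f, a: float, b: float, n: int = 2000) -> float:
    """Composite Simpson rule; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


# Illustrative values only: kappa = 4/l0 and k0 = 2*pi/l0 with l0 = 1.
kappa, k0 = 4.0, 2.0 * math.pi
numeric = simpson(lambda k: k**4 / (kappa**2 + k**2), 0.0, k0)
print(numeric, closed_form(kappa, k0))
```

The agreement follows from the decomposition $k^4/(\kappa^2 + k^2) = k^2 - \kappa^2 + \kappa^4/(\kappa^2 + k^2)$, each term of which integrates elementarily.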




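For r = 0, the value of $V_{II}$ depends on the directions of the spins. For
parallel spins, in which $(\sigma_1\sigma_2) = 1$, we get


$$V_{II}(0) = -\frac{4f^2}{3\pi\kappa^2}\left(\kappa^3\arctan\frac{k_0}{\kappa} - \kappa^2 k_0 + \frac{k_0^3}{3}\right)$$
For antiparallel spins, $(\sigma_1\sigma_2) = -3$, so that the factor $-4f^2/3$ is replaced
by $+4f^2$.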


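In order to evaluate the potentials (399) and (400), we introduce the
polar coordinates $\theta$ and $\varphi$, taking the direction of r as the polar axis.
We obtain


$$\int d\mathbf{k}\,\frac{k^2}{\kappa^2 + k^2}\,e^{i(\mathbf{k}\mathbf{r})} = 2\pi\int_0^{k_0} dk\,\frac{k^4}{\kappa^2 + k^2}\int_0^{\pi} d\theta\,\sin\theta\,e^{ikr\cos\theta} = 4\pi\int_0^{k_0} dk\,\frac{k^4}{\kappa^2 + k^2}\,\frac{\sin kr}{kr}$$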


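If we assume that the spins are parallel or antiparallel to r, we obtain


$$\int d\mathbf{k}\,\frac{(\sigma_1\mathbf{k})(\sigma_2\mathbf{k})}{\kappa^2 + k^2}\,e^{i(\mathbf{k}\mathbf{r})} = \pm 2\pi\int_0^{k_0} dk\,\frac{k^4}{\kappa^2 + k^2}\int_0^{\pi} d\theta\,\sin\theta\,\cos^2\theta\,e^{ikr\cos\theta}$$

$$= \pm 4\pi\int_0^{k_0} dk\,\frac{k^4}{\kappa^2 + k^2}\left(\frac{\sin kr}{kr} + \frac{2\cos kr}{(kr)^2} - \frac{2\sin kr}{(kr)^3}\right)$$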


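For spins perpendicular to r we obtain half of this value. Using the
abbreviations


$$J_1(r) = \int_0^{k_0} dk\,\frac{k^4}{\kappa^2 + k^2}\,\frac{\sin kr}{kr}, \qquad J_2(r) = \int_0^{k_0} dk\,\frac{k^4}{\kappa^2 + k^2}\left(\frac{\sin kr}{(kr)^3} - \frac{\cos kr}{(kr)^2}\right)$$
we obtain
$$V(r) = -\frac{2}{\pi\kappa^2}\left[g^2 J_1(r) + 2f^2 J_2(r)\right] \qquad \text{for } (\sigma_1\sigma_2) = 1 \text{ (spins } \parallel \text{ to r)}$$
$$V(r) = -\frac{2}{\pi\kappa^2}\left[(g^2 + f^2) J_1(r) - f^2 J_2(r)\right] \qquad \text{for } (\sigma_1\sigma_2) = 1 \text{ (spins } \perp \text{ to r)}$$
$$V(r) = -\frac{2}{\pi\kappa^2}\left[(g^2 - 2f^2) J_1(r) - 2f^2 J_2(r)\right] \qquad \text{for } (\sigma_1\sigma_2) = -3 \text{ (spins } \parallel \text{ to r)}$$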




The functions $J_1(r)$ and $J_2(r)$ can be worked out by a graphic method.
They are plotted in Fig. 5, where $1/\kappa$ is supposed to be less than $l_0/2$.
It is very remarkable that in this case, according to Fig. 5, the range of
the nuclear forces is given by $l_0$ rather than by $1/\kappa$. It is readily
seen that this is due to the interference of the waves sin kr and cos kr
over which the integrals $J_1$ and $J_2$ are to be taken. Each of these
integrals may be interpreted as representing the excitation caused by
a wave packet at a distance r. All the waves, sin kr and cos kr,
of which the packets are composed start from particle 1 with the same
phase 0, and their combined effect on particle 2 depends on the phase
differences with which the waves arrive at point 2. For a given
distance r, we may distinguish waves with $\lambda/2 > r$ and those with
$\lambda/2 < r$. The waves of the first kind operate at point r in the same
direction so that they reinforce one another, while the latter, which
meet in all possible phases, have a zero effect. This means that with
decreasing r an increasing number of waves cooperates in the production
of V. But, as the wave spectrum ends at a wavelength $\lambda = l_0$,
at the distance $r = l_0/2$ all the available waves are already in action,
so that with any further diminishing of r the potential can only be
changed as far as amplitudes are concerned and not by the number of
waves. Conversely, from $r = l_0/2$, with increasing r the potential
must diminish rapidly, since, for example, at the distance $r = l_0$,
only those waves with $\lambda > 2l_0$ are effective, and the number of these
is eight times smaller than that of the waves with $\lambda > l_0$. In this
case the potential diminishes to about 1/8 its initial value, and at the
distance $r = 2l_0$ it will be about 1/64 of the initial value.

[Fig. 5. The functions $J_1(r)$ and $J_2(r)$ plotted against r, with the abscissa marked at $l_0/4$, $l_0/2$, $3l_0/4$, and $l_0$.]
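
In place of the graphic method, the two integrals can be evaluated numerically. The sketch below is only illustrative: it works in units where $l_0 = 1$, so that $k_0 = 2\pi$, and it assumes $\kappa = 4/l_0$ as one arbitrary choice satisfying $1/\kappa < l_0/2$; all function names are invented here.

```python
import math

KAPPA = 4.0          # assumed value, in units of 1/l0, so that 1/kappa < l0/2
K0 = 2.0 * math.pi   # cut-off k0 = 2*pi/l0 with l0 = 1


def simpson(f, a, b, n=4000):
    """Composite Simpson rule; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def weight(k):
    """Common factor k^4 / (kappa^2 + k^2) of both integrands."""
    return k**4 / (KAPPA**2 + k**2)


def sinc(a):
    """sin(a)/a with its limit 1 at a = 0."""
    return 1.0 if abs(a) < 1e-8 else math.sin(a) / a


def tensor_kernel(a):
    """sin(a)/a^3 - cos(a)/a^2, using a series near a = 0 (limit 1/3)."""
    if abs(a) < 1e-3:
        return 1.0 / 3.0 - a * a / 30.0
    return math.sin(a) / a**3 - math.cos(a) / a**2


def J1(r):
    return simpson(lambda k: weight(k) * sinc(k * r), 0.0, K0)


def J2(r):
    return simpson(lambda k: weight(k) * tensor_kernel(k * r), 0.0, K0)


for r in (0.0, 0.25, 0.5, 1.0, 2.0):
    print(f"r = {r:4.2f} l0   J1 = {J1(r):8.3f}   J2 = {J2(r):8.3f}")
```

At r = 0 the first integral reduces to the closed form found for $V_I(0)$ and the second to one third of it, which gives a convenient check on the numerics before the values at larger r are compared with the figure.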

So we see that for a sufficiently small $1/\kappa$ ($< l_0/2$), the range R of
the nuclear forces is defined by $l_0/2$ rather than by $1/\kappa$. It is different
if $1/\kappa > l_0/2$. In this case $l_0$ loses its significance, and its place is
taken by $1/\kappa$, which now represents the range R. But, as the value of
$l_0$ is probably about $5 \times 10^{-13}$ cm, the condition that $1/\kappa$ be greater
than $l_0/2$ would demand a meson mass $m_0$ less than 150 electron masses
m, whereas actually $m_0 \approx 200m$. Therefore it must be inferred that,
in contrast to the generally accepted view, the range R of the nuclear
forces has nothing to do with the mass $m_0$ of the meson but is
connected with the fundamental length $l_0$. This result explains the
relation of R to the classical electron radius $r_0$, which up to now had to be
taken as a merely casual coincidence. In reality, both R and $r_0$ are
closely related to $l_0$. As regards $r_0$ this can be seen when the $\kappa$
correction factors are introduced into the quantized field equation (330)
in which terms with the Dirac function $\delta(P - P_0)$ occur. These
functions, which are zero for $P \neq P_0$, correspond to the point character
of the particles and are changed by the $\kappa$ factors into functions that
differ from zero within a region around $P_0$, the extension of which is of
the order $l_0$. In other words, the particles appear extended in our
theory, although we must not think of the extension in the usual sense
of the word because its size is not fixed but depends on the experiment
by which it is measured. An extension of this kind was unknown to
classical physics, which imagined an electron as a sphere of constant
radius $r_0$. We may, therefore, consider $r_0$ an inadequate anticipation
of the length $l_0$ and consider as an important success for the theory
the fact that it is able to explain the coincidence of R and $r_0$ as a
deep-rooted relationship.
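
The figure of 150 electron masses can be reproduced with a short calculation. This sketch assumes that $1/\kappa$ is the reduced Compton wavelength $\hbar/m_0c$ of the meson and takes $l_0 = 5 \times 10^{-13}$ cm as stated above; the constants are in CGS units and the function name is chosen here for illustration.

```python
HBAR = 1.054572e-27       # erg s
M_ELECTRON = 9.10938e-28  # g
C_LIGHT = 2.99792e10      # cm/s
L0_ASSUMED = 5e-13        # cm, value of l0 quoted in the text


def max_meson_mass_in_electron_masses(l0_cm: float = L0_ASSUMED) -> float:
    """Largest m0 (in electron masses) for which 1/kappa > l0/2.

    Assumes 1/kappa = hbar / (m0 c), so the condition reads
    m0 < 2 hbar / (c l0).
    """
    return 2.0 * HBAR / (C_LIGHT * l0_cm) / M_ELECTRON


print(f"m0 must stay below about {max_meson_mass_in_electron_masses():.0f} m")
```

A meson of about 200 electron masses therefore falls on the side where, by the argument of the text, the range is governed by $l_0$.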

73. Nuclear Scattering of Mesons. Magnetic Moment of
Proton and Neutron. In Section 65 we saw that, in the case of
the scattering of mesons through nucleons, the vector theory, on the
assumption of point-shaped particles, leads to a cross section which
increases rapidly with increasing energy and, when compared with the
measurements in the region of $10^9$ ev, proves to be from one thousand
to ten thousand times too great. The pseudoscalar theory is in better
agreement with experiment, although its results, being ten to one
hundred times too high, are still unsatisfactory. Since we are not
sure whether the observed mesons are vectorial or pseudoscalar, at
present it is impossible to decide whether the $l_0$ principle is able to
improve the theory. If the mesons were certain to be vectorial, the
question must be answered in the affirmative, for, as we shall see,
in that case the cross section is diminished by $l_0$ to the correct value.
For pseudoscalar mesons, however, the corrected cross section would
be far too small. At the present stage the scattering of mesons,
therefore, cannot be used as an argument for or against the $l_0$ theory,
the less so since the scattering process may be associated with a
transmutation of the mesons. Notwithstanding this fact, it may be
useful for further investigations to know the influence of $l_0$ on the
probability with which a meson of high energy is scattered.

According to Section 65 the process of scattering is achieved in two
steps. The nucleon first absorbs a meson, thereby passing from its
initial state A into an intermediate state Z, whereupon another meson
is emitted in the transition Z → E. The other possibility that the two
processes take place in the reverse order may be disregarded, since a
nucleon at rest is unable to emit a meson with a momentum greater
than $h/l_0$. In order to calculate the correction factor $\kappa$ which belongs
to $H'_{AZ}$, we must take into account the recoil the nucleon receives by
the absorption. If the momentum of the absorbed meson is given by
$h/\lambda$, where $\lambda$ is supposed to be less than $l_0$, the conservation of
momentum requires that


$$\frac{mv}{\sqrt{1-\beta^2}} = \frac{h}{\lambda} \qquad (m = \text{mass of the nucleon})$$
and hence
$$\beta = \frac{h/mc\lambda}{\sqrt{1 + h^2/m^2c^2\lambda^2}} = \frac{l_0/3\lambda}{\sqrt{1 + (l_0/3\lambda)^2}}$$


since $h/mc \approx l_0/3$ if we assume the value $5 \times 10^{-13}$ cm for $l_0$.

For the wavelength $\lambda''$ into which $\lambda$ is transformed by the passage
from the original coordinate system K to the system K″ in which the
nucleon is at rest, we get


$$\lambda'' = \lambda\,\frac{1 + \beta\cos\theta''}{\sqrt{1-\beta^2}} \approx \frac{2\lambda}{\sqrt{1-\beta^2}} = 2\sqrt{\lambda^2 + \left(\frac{l_0}{3}\right)^2}$$


Thus, for a wavelength very much less than $l_0$, we have $\lambda'' = 2l_0/3$.
This means that $\int dv\, e^{i(\mathbf{k}''\mathbf{r})}$ is different from zero but has a small value
only, since a considerable part of the wave motion is ineffective. In
Code's experiments the energy of the mesons was $0.8 \times 10^9$ ev,
corresponding to a wavelength $\lambda = 1.7 \times 10^{-13}$ cm; for $\lambda''$ we would then
have $0.9l_0$. Using this value, we must determine the value of $\kappa_2$ from


$$\kappa_2 = \left|\frac{\int dv\, e^{i(\mathbf{k}''\mathbf{r})}}{\int dv\,\left(e^{i(\mathbf{k}''\mathbf{r})}\right)_0}\right|$$


using in the denominator the value of $e^{i(\mathbf{k}''\mathbf{r})}$ at the center of the sphere.
By a graphical method we find that $\kappa_2 \approx 1/4$; hence $\kappa = \frac{1}{2}(\kappa_1 + \kappa_2) \approx 1/8$
since $\kappa_1 = 0$.
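
The recoil velocity and the transformed wavelength can be tabulated with a few lines of code. This is a sketch under the approximations of the text only: it takes $h/mc = l_0/3$, replaces $1 + \beta\cos\theta''$ by 2, and uses $l_0 = 5 \times 10^{-13}$ cm; the function names are hypothetical.

```python
import math

L0_ASSUMED = 5e-13  # cm, value of l0 assumed in the text


def recoil_beta(lam: float, l0: float = L0_ASSUMED) -> float:
    """beta of the nucleon after absorbing a meson of wavelength lam."""
    x = l0 / (3.0 * lam)  # h/(m c lam), using h/mc ~ l0/3
    return x / math.sqrt(1.0 + x * x)


def wavelength_in_rest_frame(lam: float, l0: float = L0_ASSUMED) -> float:
    """lambda'' ~ 2 lam / sqrt(1 - beta^2) = 2 sqrt(lam^2 + (l0/3)^2)."""
    return 2.0 * math.sqrt(lam**2 + (l0 / 3.0) ** 2)


for lam in (1e-15, 1e-14, 1.7e-13):
    ratio = wavelength_in_rest_frame(lam) / L0_ASSUMED
    print(f"lam = {lam:.1e} cm  beta = {recoil_beta(lam):.4f}  lam''/l0 = {ratio:.3f}")
```

For very short wavelengths the ratio tends to 2/3, and for the wavelength of Code's experiments it comes out slightly below unity, close to the value used in the text.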

The correction factor for $H'_{ZE}$ can be determined in a similar way.
Here $\kappa_2 = 0$, while $\kappa_1$ is given by


$$\kappa_1 = \left|\frac{\int dv\, e^{i(\mathbf{k}'\mathbf{r})}}{\int dv\,\left(e^{i(\mathbf{k}'\mathbf{r})}\right)_0}\right|$$
In the above, k′ is the wave vector of the emitted meson when it is
observed in the system K″ of the preceding calculation. k′ corresponds
to a wavelength $\lambda'$ given by


$$\lambda' = \lambda\,\frac{1 + \beta\cos\theta'}{\sqrt{1-\beta^2}}$$
if we assume that no energy is lost by the scattering so that the
absorbed and emitted mesons have the same wavelengths. $\theta'$ is
the angle between the directions defined by the incident and the
emitted meson. The angle $\theta'$ is measured in the system K″, and for
$\kappa_1 \neq 0$ it must satisfy the condition $1 + \beta\cos\theta' > 0$ in order that $\lambda'$
become greater than $l_0$. This means that the angle $\theta$, measured in
the initial system K, must be less than $mc\lambda/h$, which is nearly equal to
$3\lambda/l_0$. Then $\kappa$ again becomes 1/8, and the probability of the transition
A → E, which is proportional to the square of $H'_{AZ}H'_{ZE}$, must be
multiplied by $\kappa^4$, that is, by $1/8^4$. The cross section for vector mesons
of $0.8 \times 10^9$ ev is then reduced to a value of $0.3 \times 10^{-27}$ cm², which is
in good agreement with the result, $0.6 \times 10^{-27}$ cm², of the
measurements. This would be satisfactory if there were not the possibility
of the mesons being pseudoscalar. Then the same factor, $1/8^4$, would
have to be applied, with the result that the cross section would become
far too small.

There is no such ambiguity in the treatment of the magnetic moment
of the proton and the neutron. According to Section 66 this moment
comes about by the emission and reabsorption of mesons by the
nucleon, only vector mesons being involved since the moment is due
to their spin. Since only mesons with a momentum less than $h/l_0$ can
be emitted by a nucleon at rest, the integration in (382) relative to k
is only to be extended from k = 0 up to near $k_0 = h/l_0$. This limitation
of the integral is brought about by the correction factor $\kappa$, with
the square of which the integrand must be multiplied and which, in
the neighborhood of $k_0 = h/l_0$, decreases to zero. The effect of $\kappa$ can
be evaluated by means of a graphic integration, leading to the result


$$\mu_P = 2.4\mu_0 \qquad \mu_N = -1.4\mu_0 \qquad (\mu_0 = \text{nuclear magneton})$$


which as a first approximation are in rather good agreement with the
experimental values


$$\mu_P = 2.7\mu_0 \qquad \mu_N = -1.9\mu_0$$


74. Decay of Negative Mesons in Light Elements. It is in
the nature of the length $l_0$ that it generally manifests itself only in
experiments dealing with particles of extremely high energy. But,
recently, a very striking phenomenon in the region of quite normal
energies has been observed. It is inexplicable for the current theory,
the only interpretation permissible being that in the interaction
between nucleon and meson a limiting constant must be at work. In
the experiments of Conversi, Pancini, and Piccioni† the behavior of
mesons traversing a layer of matter and being stopped by it was
studied. When, by the collisions it experiences, a negative meson is
sufficiently slowed down, it is captured by an atom and takes on a
motion of revolution around the nucleus. After a very short time it
reaches the K orbit, and we should expect that it would be absorbed
from this orbit by a proton of the nucleus. We can calculate the
probability of the absorption and find that it must take place in an
interval of the order of magnitude $10^{-18}$ seconds, which is shorter by
a factor of about $10^{12}$ than the $10^{-6}$ seconds required by the meson for its
decay into an electron and a neutrino. Therefore it should be
impossible to observe the appearance of decay electrons. However,
experimental results are in sharp contradiction to this expectation. It is
true that negative decay electrons are not observed in iron, but, if
carbon is used as a stopping material, the number of the negative
disintegration electrons becomes nearly the same as that of the
positive ones. This means that, for carbon, the absorption probability
of a negative meson must be smaller by at least a factor $10^{12}$ than we
should expect according to the theory.


† M. Conversi, E. Pancini, and O. Piccioni, Phys. Rev., 71, 209 (1947).




As Sigurgeirsson† was able to show, the result holds not only for
carbon but also for other light elements with an atomic number Z
which is less than 10. With increasing Z the absorption probability
gradually increases and reaches the theoretical value when Z is
sufficiently high.

A discrepancy between experiment and expectation by a factor $10^{12}$
creates a very serious situation for current nuclear theories. It is
important, therefore, that we come to a very simple explanation of the
effect by taking into account the length $l_0$. Let us assume a
pseudoscalar meson which is moving on the K orbit. We denote the wave
function of the meson by $\psi$ and that of the proton belonging to the
nucleus by $\Phi$. Then, according to (360), the interaction of the meson
$\psi$ and the proton $\Phi$, apart from an irrelevant factor, is given by


$$\int dv\,(\Phi^*\sigma\Phi)\,\mathrm{grad}\,\psi + \text{conjugate complex} \qquad (401)$$


The problem now is the interpretation of this expression. The
current theory considers $\Phi$ a function of xyz from which can be
evaluated the probability that the particle, represented by $\Phi$, is found at
point xyz when an exact measurement of the position is made. But
this interpretation is inadequate to the possibilities of a real
measurement. Actually there is a limit to the accuracy with which a position
can be ascertained, and we must give the function $\Phi$ a meaning that
takes this limit into account. This is done by defining $\Phi$ as a function
which determines the probability that the particle is found in
coincidence with a reference particle to which point xyz is attributed. By
coincidence we mean that the two particles, because of a separation
less than $l_0$, cannot be distinguished from each other. This definition
attaches an observational meaning to the function without destroying
its continuous character. Although the function remains formally
the same, it now refers to a point xyz in a way that corresponds to the
possibilities of observation. We must not consider the ascribing of
definite coordinates xyz to the reference particle a difficulty which
seemingly contradicts the principle that an exact measurement of
position is impossible, for here our procedure is to be understood in
the sense that by means of the reference particle we define the
coordinate system.

† T. Sigurgeirsson and A. Yamakawa, Phys. Rev., 71, 319 (1947).

Let us now take as the reference particle that proton $P_0$ of the
nucleus which is situated at the center of the K orbit and to which we
attribute the position r = 0. The $P_0$ is represented by a function
$\Phi$ which differs from zero only in the immediate neighborhood of
r = 0, so that in (401) we may put


$$(\Phi^*\sigma\Phi) = \sigma\,\delta(r) \qquad (402)$$
$\delta(r)$ being the Dirac function which vanishes for $r \neq 0$ and diverges
for r = 0 in such a way that $\int dv\,\delta(r) = 1$. Conversely, there is,


according to the current theory, one and only one particle to which
equation (402) applies, so that by


$$\int dv\,(\Phi^*\sigma\Phi)\,\mathrm{grad}\,\psi$$


the interaction of the considered meson with only one proton, situated
at r = 0, is described. But, according to our interpretation of the
function $\Phi$, (402) applies to any particle which is in coincidence with
$P_0$, that is, which is separated from $P_0$ by a distance less than $l_0$. By
introducing (402) into (401) we get the interaction of the meson with
all these protons. Then (401) becomes


$$\sigma\,(\mathrm{grad}\,\psi)_0$$


the subscript 0 meaning that grad $\psi$ is to be taken at the point r = 0.
Now the function of the K orbit depends only on r and therefore, at
the point r = 0, grad $\psi$ has the same intensity in any direction and
thus $(\mathrm{grad}\,\psi)_0 = 0$. This means that a nucleus, consisting only of
protons which are all in coincidence with one another, is not able to
absorb a pseudoscalar meson. An absorption other than zero is
possible only if the protons of the nucleus do not form a spatially
indissoluble unit. The protons which coincide with $P_0$ lie within a
sphere of radius $l_0$. Because of their own apparent extension, the
radius R of the nucleus which is unable to absorb a meson becomes
$3l_0/2$. In its dependence on the atomic weight A, the approximate
value of R is given by $R = (l_0/2)A^{1/3}$. The value $R = 3l_0/2$ therefore
corresponds to A = 27, $Z_0 \approx 13$. But it is to be expected that the
critical value of $Z_0$ will be somewhat smaller, since it should not be
supposed that the nuclei are exactly spherical.
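
The critical mass number follows directly from inverting $R = (l_0/2)A^{1/3}$. The short sketch below only restates that arithmetic, with lengths measured in units of $l_0$; the function names are invented for the illustration.

```python
def nuclear_radius(mass_number: float, l0: float = 1.0) -> float:
    """R = (l0/2) * A^(1/3), the approximate radius used in the text."""
    return 0.5 * l0 * mass_number ** (1.0 / 3.0)


def critical_mass_number(radius_in_l0: float = 1.5) -> float:
    """Invert R = (l0/2) A^(1/3) for A, with R given in units of l0."""
    return (2.0 * radius_in_l0) ** 3


print(critical_mass_number())   # mass number for which R = 3 l0 / 2
print(nuclear_radius(12.0))     # radius of a carbon nucleus in units of l0
```

On this estimate a carbon nucleus (A = 12) lies inside the critical radius, while iron (A = 56) lies well outside it, in line with the observations cited above.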

It should be pointed out, however, that for a longitudinal vector
meson the interaction is not determined by $(\mathrm{grad}\,\psi)_0$ but by $(\mathrm{div}\,\psi)_0$,
which is not equal to zero. Therefore, for vector mesons a restriction
on their absorption should not exist.




